An Introduction to Deep Learning
Formal Metadata
Title  An Introduction to Deep Learning 
Title of Series  EuroPython 2016 
Part Number  166 
Number of Parts  169 
Author 
French, Geoffrey

License 
CC Attribution-NonCommercial-ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and noncommercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license. 
DOI  10.5446/21145 
Publisher  EuroPython 
Release Date  2016 
Language  English 
Content Metadata
Subject Area  Computer Science 
Abstract  Deep learning: how it works, how to train a deep neural network, the theory behind deep learning, recent developments and applications.  (length: 60 mins) In the last few years, deep neural networks have been used to generate state-of-the-art results in image classification, segmentation and object detection. They have also successfully been used for speech recognition and textual analysis. In this talk, I will give an introduction to deep neural networks. I will cover how they work, how they are trained, and a little bit on how to get going. I will briefly discuss some of the recent exciting and amusing applications of deep learning. The talk will primarily focus on image processing. If you are completely new to deep learning, please attend T. Rashid's talk 'A Gentle Introduction to Neural Networks (with Python)'. His talk is in the same room immediately before mine and his material is really good and will give you a good grounding in what I will present to you. 
Transcript
00:00
OK, this is an introduction to deep learning. [unintelligible] Thank you for coming.
00:21
OK, let's start by thanking Tariq Rashid for giving his excellent gentle introduction to neural networks. I'm going to build upon that and hopefully show you how to develop some of the networks that have been used to get the really good computer vision results that we've seen recently. So the focus is mainly going to be on
00:41
image processing this morning. This talk is more about the principles and maths behind it than the code, and the reason is that it's quite a big topic with a lot to go through, so I've left out a lot of code that would otherwise have been useful. As a quick overview of what we're going to go through: we are going to discuss the Theano library, which is the one I personally use, although there's also TensorFlow. And we're going to cover the basic model of what
01:11
a neural network is, just building on Tariq's talk. Then we're going to go through convolutional networks, and these are the networks that are getting the really, really good results that we've seen recently. Then I'll briefly cover Lasagne, which is another Python library, built
01:28
on top of Theano to make it easier to build neural networks. We'll discuss why it's there and what it does, and then I'll give you a few hints about how to actually build a neural network: how to structure it and what layers to choose, just so you have a rough idea, and how to train them. Just a few hints and tips to practically get going. And finally, time permitting, I'll go through the OxfordNet VGG
01:53
network, which is about how to use a pretrained network that you can download under a Creative Commons license from Oxford University and use yourself. I'll go through why it's sometimes useful to use a network that somebody else has trained for you and tweak it for your own purposes. Now, the
02:12
nice thing is there are some talk materials. This is based
02:15
on the tutorial I gave at PyData London in May, and if you check out the GitHub repository Britefury deep-learning-tutorial-pydata2016, you'll find it there. All my notebooks are viewable on GitHub, so you should be able to see everything there in your browser. I would ask that you please, please, please do not try and run
02:38
this code during the talk. The reason is that when you run the stuff that uses the VGG Oxford models, it will need to download a 500 MB weights file, and you will kill the WiFi if you all start doing so. So please do it in your own time. And also, yeah,
02:58
if you want to go more in depth on Theano, I've put up some slides; you can
03:03
check them out on my Speaker Deck profile. This talk's slides will be there, and there is also an intro to Theano and Lasagne as well, which will give you a breakdown of Python code using Theano, an idea of what it does and how to use it. Furthermore, if you don't have a machine available, or you don't want to set one up yourself, I have set up an AMI for you, so if you want to go and use one of their
03:28
GPUs, you can go and grab hold of that and run the code there. Everything's all set up, and I hope it's relatively easy to get into. All right, now to get into the
03:39
meat of the talk, and what better place to start than ImageNet. ImageNet is an academic image classification dataset. You've got around a million images, I think it might be even more now, divided into a thousand different classes, so you've got various different types of dog, various different types of cat, flowers, buckets, whatever else they came up with, rocks, snails. And the
04:04
way the ground truth works is this: you've got a bunch of images that were scraped from Flickr, and you've got to know what the ground truth for each image is, and the way that was
04:12
prepared is that they went and got some people to do it over Amazon Mechanical Turk. Now, the top-5
04:18
challenge: what you've got to do is produce a classifier that, when given an image, will produce a probability score
04:26
for what it thinks it is, and you score a hit if the ground truth class, the actual true class, is somewhere within the top 5 choices that your model, or your neural network or whatever you use, gives for what it thinks the image is. And in
04:41
2012 the best approaches of the time used a lot of hand-crafted features; for those of you who know computer vision, these are things like SIFT, HOG and Fisher vectors, which you
04:51
stick into a classifier, and the top-5 error rate was around 25%. And then the game changed, because
05:02
Krizhevsky, Sutskever and Hinton, in their paper "ImageNet Classification with Deep Convolutional Neural Networks" (a bit of a mouthful), managed to get the error down to 15%. In the last 3 years, more modern network architectures have gone down further; now we're down to about 5 to 7%, and I think people at Microsoft have even got down to 3 or 4%. I hope that this talk is going to give you an idea of how it's done.
05:27
OK, let's have a quick run over Theano. Neural network software
05:33
comes in 2 flavors, or it's kind of a spectrum really. At one end you've got the neural network toolkits, which are quite high level, and at the other end you've got expression compilers. With neural network toolkits, you specify the neural network in terms of layers. Expression compilers are somewhat lower level: you describe the mathematical expressions, which Tariq covered, that are behind the layers and that effectively describe the network. It's a more powerful and flexible approach. Theano is an expression compiler: you write NumPy-style expressions and it's going to compile them either to C
06:12
to run on your CPU, or to CUDA to run on an NVIDIA GPU if you have one of those available. And once again,
06:20
if you want to go into that in more depth, there are the slides that I mentioned earlier. There's a lot more to Theano, so go and check out the deeplearning.net website to learn more about it; there is a full description of the API and everything it will
06:35
do, some of which you may want to use. There are of course others: there's
06:40
TensorFlow, developed by Google, which is gaining popularity really fast these days, so that may well be the future; we'll see. OK, what is a
06:48
neural network? Well, we're going to cover a fair bit of what Tariq covered in the previous talk. It's got multiple layers, and the data propagates through each layer and is transformed as it goes through. So we might start out with an image of a bunch of bananas. It's going to go through the first hidden layer and get transformed into a different representation, then get transformed again in the next hidden layer, and finally, assuming we're doing an image classifier, we end up with a probability vector. Effectively, all the values in that vector
07:21
sum to 1, and our predicted class corresponds to the row in the probability vector that has the
07:32
highest probability. OK, and this is what a network kind of looks like. We see there are the weights that you saw in the previous talk, which connect all of the units between the layers, and you can see data being put in at the input, propagating through and arriving at the output. Breaking down a single layer of a neural network: we've got our input, which is basically a vector, an array of numbers. We multiply it by a weight matrix, which is the crazy lines, then we have a bias term, which is simply an offset vector, and then we have our
08:08
activation function, or nonlinearity; those terms are roughly interchangeable. And that gives us the output layer
08:15
activations, which then go into the next layer, or are the output if this is the last layer in the network. Mathematically speaking, x is our input vector and y is our output. We represent our weights by the weight matrix W; that's one of the parameters of our network. Our other parameter is the bias, b. We've got our nonlinearity function f, and normally these days that's ReLU, the rectified linear unit. It's about as simple as they come: it's simply max(x, 0). That's the activation function that has become the most popular recently. In a nutshell, y = f(Wx + b), repeated for each layer as the data goes through, and that's basically a neural network: just that same formula repeated over and over, once for each layer. So to make an image classifier, we take the pixels from our image, splat them out into a vector, stretching them out row by row, run that through the network and get a result. So, in summary, a neural network is built from layers, each of which is a matrix multiplication, then a bias, then a nonlinearity. OK, and how do we train a neural network? We've got to learn values for our parameters, the weights and the biases, for every layer.
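The layer formula y = f(Wx + b) described above can be sketched in a few lines. This is only an illustration: it assumes plain Python lists rather than Theano tensors, and the function names and the toy weights are made up for the example.

```python
def relu(v):
    """Rectified linear unit: max(x, 0) applied to every element."""
    return [max(x, 0.0) for x in v]


def dense_layer(x, W, b, f=relu):
    """Compute y = f(Wx + b) for one layer.

    W is a list of rows, one row of weights per output unit.
    """
    pre_activation = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(W, b)
    ]
    return f(pre_activation)


def forward(x, layers):
    """Apply the same formula once per layer; layers is a list of (W, b)."""
    for W, b in layers:
        x = dense_layer(x, W, b)
    return x


# Hypothetical toy network: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),
    ([[1.0, 2.0]], [0.5]),
]
y = forward([2.0, 1.0], layers)
```

A real image classifier would flatten the image pixels into `x` and end with a softmax layer instead of ReLU; the loop over layers stays the same.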
09:48
For that we use backpropagation. We're going to initialize our weights
09:52
randomly (more on this later) and initialize the biases all to 0. Then, for each example in our training set, we've got to evaluate the network's prediction, see what it reckons the output is, and compare it to the actual training output, what it should produce given the input. We've got to measure our cost function, which
10:17
is, roughly speaking, the error: the difference between what the network is predicting and what it should
10:24
predict, the ground-truth output. Now, the cost function is kind of important, so we'll discuss it a little bit. For classification, the idea is: given an input and a bunch of categories, which category best describes this input? For the final layer we use a function called softmax as our nonlinearity, or activation function, and it outputs a vector of
10:53
class probabilities. The best way of thinking about it is this: say I've got a bunch of numbers; I sum them all up and divide each element by the sum. That gives roughly the proportion, or probability, assuming all of our numbers are positive to start with. But they
11:09
can also go negative in a neural network, so softmax adds one little wrinkle: before summing up, we take our input numbers and compute the exponential of them all, then we sum those up and divide each exponential by the sum of the
11:22
exponentials. That's softmax. Our cost function is negative log-likelihood, also called categorical cross-entropy. To compute it, say you have an image of a dog: you
11:40
run the image through the network and see what the predicted probability is for dog. You take the log of that probability. If the probability is 1, the log of that is 0; if it's something like 0.1, the log is going to be quite strongly negative, and then you negate that log. So the idea is that if the network is supposed to output dog, it should give a probability of 1. If it's giving a probability of less than that, the negative log will be quite positive, which indicates high error. So that's your cost.
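As a rough sketch of the softmax and negative log-likelihood steps just described, using only the Python standard library. Subtracting the maximum before exponentiating is a common numerical-stability step that is an addition here, not something from the talk, and the example probabilities are invented.

```python
import math


def softmax(values):
    """Exponentiate each value, then divide by the sum of the exponentials."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def negative_log_likelihood(probabilities, true_class):
    """Cost for one sample: minus the log of the probability of the true class."""
    return -math.log(probabilities[true_class])


# Raw network outputs may be negative; softmax still yields probabilities.
probs = softmax([2.0, -1.0, 0.5])

# A confident correct prediction costs little; an unsure one costs more.
confident = negative_log_likelihood([0.05, 0.9, 0.05], true_class=1)
unsure = negative_log_likelihood([0.45, 0.1, 0.45], true_class=1)
```

A probability of exactly 1 for the true class gives a cost of 0, and the cost grows without bound as that probability approaches 0.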
12:12
Regression is different. Rather than classifying an input and saying which category most closely matches it, you try to quantify or measure the strength of something, the strength of some response. Typically,
12:25
the final layer doesn't have an activation function; it's just the identity, or linear, and your cost is going to be the sum of squared differences. Then what we've
12:37
got to do with our neural network is reduce the cost, reduce the error, using gradient descent. What we have to do is compute the derivative, the gradient, of the cost with respect to our parameters, which are all our weights and biases within all our layers. The cool thing is that Theano does the symbolic differentiation for you. I can tell you right now that you don't want to be in a situation where you have this massive expression for your neural network and you've got to go and compute the derivative of the cost with respect to some parameter by hand, because you will make a mistake. You will flip a minus sign somewhere, and then your network won't learn, and debugging will be a goddamn nightmare because it's really hard to figure out where it's gone wrong. So I would recommend getting a symbolic mathematics package to do it for you. Theano handles it all, and you actually write that code there
13:34
just by saying that this is the gradient of the cost with respect to the weights. Other toolkits do this as well, just to save you time and sanity.
13:43
Then you update your parameters: you take your weights and subtract the learning rate, which is lambda, multiplied by the gradient. Generally the learning rate should be somewhere in the region of 1 × 10^-4 to 1 × 10^-2, something in that region. You also typically don't train on one example at a time. You take what's called a minibatch of about 100 samples from the dataset, and you compute the cost for
14:18
each of those samples, average the costs together, and then compute the derivative of the average cost with respect to all of your parameters. The idea is that you get about 100 samples
14:34
processed in parallel, and when you run on a GPU that tends to speed things up a lot because it uses all of the parallel processing power of the GPU. Training on all the examples in your entire training set is called an epoch, and you often want multiple epochs to train a network, sometimes 300 or so. So, in summary: take a minibatch of training samples, evaluate them by running them through the network, measure the average error, or cost, across the minibatch, and use gradient descent to modify the parameters to reduce the cost. And
15:11
that's basically about it. All right.
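The training loop summarized above (random weights, zero biases, minibatches, average cost, gradient step, many epochs) might be sketched as follows. To keep the example self-contained, it fits a one-weight linear model with a squared-error cost and a hand-derived gradient; that is only feasible because the model is tiny, and it stands in for the symbolic differentiation that Theano would do. The data, batch size and epoch count are invented for the illustration.

```python
import random


def train(samples, learning_rate=0.01, batch_size=4, epochs=300, seed=0):
    """Fit y = w * x + b by minibatch gradient descent on mean squared error."""
    rng = random.Random(seed)
    w, b = rng.uniform(-0.1, 0.1), 0.0   # random weight, zero bias
    samples = list(samples)
    for _ in range(epochs):              # one epoch = one pass over the training set
        rng.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            # Gradient of the average cost over the minibatch (derived by hand
            # here; an expression compiler would derive it symbolically).
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            # Update rule: parameter = parameter - learning rate * gradient.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Hypothetical data drawn from y = 3x + 1 with x in [-1, 1].
data = [(k / 10.0, 3 * k / 10.0 + 1) for k in range(-10, 11)]
w, b = train(data)
```

With these settings the learned `w` and `b` end up close to 3 and 1; a learning rate that is too large makes the updates diverge instead.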
15:16
The multilayer perceptron: this is the simplest neural network architecture, and there's nothing here we haven't seen so far. It uses only fully connected, or dense, layers. In a dense layer, each unit is connected to every single unit in the previous layer. To pick up from Tariq's talk, the MNIST handwritten digits dataset is a good place to start. A network with 2 hidden layers, both of 256 units, after 300 iterations gets about 1.83% validation error, so that's about 98.17% accuracy, which is pretty good.
16:08
However, these handwritten digits are quite a special case: all the
16:14
digits are nicely centered within the image; they're all in the same position and scaled to about the same size, as you can see in the examples there. Fully connected networks have one weakness: there's no translational invariance. Imagine you want to detect a ball somewhere in the image. What this effectively means is that the network will only learn to pick up the ball in the positions where it has been seen so far; it won't learn to generalize across all positions in the image. One of the cool things we can do is take the weights that we learned, take one of the units in the first hidden layer,
16:57
and take the strengths of the weights that link it to all the pixels in the input layer and visualize them; this is what you end up with. You can see that your first hidden layer weights form a bunch of feature detectors that pick up the various strokes that make up the digits. It's kind of cool to visualize, but it also shows you how the dense layers are translationally dependent. And so for
17:22
general imagery, say you want to detect cats, dogs, various spiders and all sorts of other creatures and things, you'd have to have a training set large enough to have every single possible feature in every single location of all the images, and you'd have to have a network with enough units to represent all this variation. So you're going to have a training set in the trillions and a neural network with billions and billions of nodes, and you'd need about all the computers in the world and until the death of the universe to train it. So convolutional networks are how we address that. Convolution is a fairly common operation in computer vision and signal processing. What you do is slide a convolutional kernel
18:11
over the image. Imagine the image pixels on one layer, and on top you overlay your kernel, which has a bunch of little weights. You multiply each value in the kernel by the pixel underneath it, for all values in the kernel, then take those products and sum them all up. Then you slide the kernel one position to the side and do the
18:40
same again, and what you end up with is an output.
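The sliding-kernel operation described above can be sketched directly with nested loops. This version assumes a single-channel image stored as a list of rows, computes only the positions where the kernel fits entirely inside the image, and does not flip the kernel (as is usual in neural network libraries). The image and the edge kernel are made up for the illustration.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image, keeping only fully covered positions."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply each kernel value by the pixel underneath it and sum.
            row.append(sum(
                kernel[a][b] * image[i + a][j + b]
                for a in range(kh) for b in range(kw)
            ))
        output.append(row)
    return output


# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1]] * 4
# A simple vertical-edge kernel (an assumption for illustration).
kernel = [[-1, 1],
          [-1, 1]]
response = convolve2d(image, kernel)
```

The response is non-zero only in the column where the kernel straddles the edge, which is the sense in which a kernel acts as a feature detector.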
18:46
It's often used for feature detection. A brief detour to Gabor filters: suppose we produce a bunch of these filters, each of which is the product of a sine wave and a Gaussian function, so you get these soft, circular, wavy-looking things. If you do the convolution, you can see that they act as feature detectors that detect certain features in the image. You can see that the ones with the vertical bars roughly pick out the vertical lines in the image, and the ones with the horizontal bars pick out the horizontal lines. So you can see how convolutions act as feature detectors, and they are used quite a lot for that. Back on track to convolutional networks. For a quick recap, that's what a fully connected layer looks like, with all of our inputs connected to all of our outputs. In a convolutional layer, you'll notice that a node on the right is only connected to a small neighborhood of nodes on the left, and the next node down is connected to a small corresponding neighborhood. The weights are also shared, which means we use the same value for all the red weights, and for all the greens, and for all the others, and the values of these weights form that kernel, that feature detector. For practical computer vision, these kernels are mainly learned. In a
20:16
convolutional network, more than one kernel has to be used because you've got to extract a variety of features. It's not sufficient just to be able to detect the horizontal edges; you want the vertical ones and all the other various orientations and sizes as well. So you have a range of kernels, so you're going to have
20:36
different weight kernels, and the idea is that you've got an image with 1 channel on the input and, say, 3 channels on the output. In real convolutional networks you might have 48 channels or 256; we'll see examples later. Some architectures have quite a high dimensionality in the number of channels. Each kernel connects to the pixels in all channels in the previous layer, so it draws in data from all channels in the previous layer. However, the maths is still the same, and the reason is that a convolution can be expressed as a multiplication by a weight matrix; it's just that the weight matrix is quite sparse. So the
21:21
maths doesn't really change, conceptually, and that's fortunate for us because it means that the gradient descent and everything we've done so far still works. As for how you go about figuring out the gradients, I'd just recommend letting Theano do it for you; that wouldn't be
21:39
something I'd recommend doing yourself. There's one more thing we need: downsampling. Typically, if you've worked with
21:48
an image-editing package, you can shrink an image down by a certain amount, say about 50%.
21:54
You're shrinking the resolution, and for that we use one of two operations: max-pooling or striding.
22:01
For max-pooling, you can see the image up there is divided into 4 colored blocks. Say the blue block has 4 pixels: we take those 4 pixels, pick out the maximum value and use that. Rather than averaging, we just take the maximum, and that's max-pooling. It downsamples the image by a factor of p, where p is the size of the pool, and it operates on each channel independently. The other option is striding, and what we do there is effectively pick a sample, skip a few, pick a sample, skip a few. It's even simpler. This is
22:40
often quite a lot faster, because the convolution operations support strided convolutions where, rather than
22:48
computing the whole output and then throwing some of it away, the kernel effectively jumps over by a few pixels each time. So that's faster and you get similar results. So, moving on.
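The two downsampling options described above can be sketched on a single channel. This assumes a list-of-rows image and non-overlapping pools. The `stride` helper subsamples an existing output, which gives the same values a strided convolution would produce directly without computing the skipped positions. The example image is invented.

```python
def max_pool(image, p=2):
    """Keep the maximum of each non-overlapping p x p block."""
    return [
        [
            max(image[i + a][j + b] for a in range(p) for b in range(p))
            for j in range(0, len(image[0]) - p + 1, p)
        ]
        for i in range(0, len(image) - p + 1, p)
    ]


def stride(image, s=2):
    """Pick a sample, skip s - 1, pick the next: plain subsampling."""
    return [row[::s] for row in image[::s]]


image = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool(image)    # [[4, 2], [2, 8]]
strided = stride(image)     # [[1, 2], [0, 5]]
```

Both halve the resolution of the 4x4 input; max-pooling keeps the strongest response in each block, while striding simply keeps every second sample.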
23:02
Yann LeCun used convolutional networks to solve the MNIST dataset in 1995, and this is a simplified version of his architecture. We've got a 28 by 28 input image with 1 channel, as it's monochrome. We've got 20 kernels, 5 by 5, so that reduces the image to 24 by 24, and that's now 20 channels. Max-pool to shrink it by half. Then we have 50
23:33
kernels, 5 by 5. Now we've
23:36
got a 50-channel image, 8 by 8. Max-pool to shrink it by half, then flatten it and do a fully connected dense layer of 256 units, and finally fully connect that to our 10-unit output layer for our class probabilities. After
23:53
iterating over the training set, we get 99.21% accuracy, or 0.79% error.
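The sizes quoted for the simplified architecture above follow from two small rules, sketched here. The sketch assumes convolutions without padding and 2x2 non-overlapping max-pooling, which is what the 28 to 24 to 12 to 8 to 4 progression implies; the helper names are hypothetical.

```python
def conv_output_size(size, kernel):
    """Spatial size after an unpadded convolution with a square kernel."""
    return size - kernel + 1


def pool_output_size(size, pool):
    """Spatial size after non-overlapping max-pooling."""
    return size // pool


size, channels = 28, 1          # 28x28 monochrome MNIST digit
shapes = [(channels, size, size)]
for kernels, kernel_size in [(20, 5), (50, 5)]:
    size = conv_output_size(size, kernel_size)
    shapes.append((kernels, size, size))      # after convolution
    size = pool_output_size(size, 2)
    shapes.append((kernels, size, size))      # after 2x2 max-pooling

# Number of values flattened and fed to the 256-unit dense layer.
flattened = shapes[-1][0] * size * size
```

Here `shapes` lists (channels, height, width) after each step, ending at 50 channels of 4 by 4, so the dense layer sees 800 inputs.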
23:59
And that is not too bad. What about the learned kernels? It's interesting to think about what the feature detectors are picking up. If we look at, say, a big dataset like ImageNet, this is from the Krizhevsky paper I mentioned at the beginning; these are the kernels learned by their
24:16
neural network, and for comparison there are the Gabor filters over there. The reason the colored ones are at the bottom is just because of the way they did the actual training, splitting it two ways,
24:26
but if you look at the top right, you can
24:28
see how it has picked up all these various edge detectors of various sizes and orientations. That's the first layer. Zeiler and Fergus took it further and figured out a way of visualizing how the kernels in the second layer respond. So you can see that they respond to various somewhat more complex features: squares, curves, textures and some
24:54
circular features. Further up, at about layer 3, you get somewhat more complex features
25:01
still: we've got things that recognize simple parts of objects. OK, this gives you an idea of roughly how convolutional networks fit together: they operate as feature detectors, where each layer builds on the previous one, picking up ever more complex features. OK, now
25:23
moving on to Lasagne. If you are
25:27
specifying your network using mathematical expressions in Theano, it's really powerful but it's quite low level. If you have to write out
25:34
your neural network as mathematical expressions each time, it could get a bit painful. Lasagne is built on
25:40
top of Theano and makes it nicer to build networks. Using
25:46
its API, rather than having to specify mathematical expressions, you can construct the layers of the network and then get Theano expressions for its output or loss. It's a Python layer on top of Theano, so you still work with Theano in the end. The cool thing about having one of these mathematical expression compilers is that if you want to come up with some crazy new loss function, or do something new and crazy, or whatever else you like, if you want to be a bit inventive, you can just go and write out the maths and let Theano take care of figuring out how to run it using NVIDIA's CUDA, so you don't have to worry about that yourself. It's quite easy to get going, as you just do it all in Python, which is great; that's why I happen to like it. Once again, the slides are available if you want to go and dive in in more detail. As for how to build and train neural
26:42
networks, I'm going to start out with
26:46
a bit about the architecture. If you want to get a nice neural network going, I'm going to try and give you some rough ideas of what layers you want to use and where to put them to get something that's going to give you good results. So the early part of the
27:02
network is going to be
27:04
after your input layer, a series of blocks, each consisting of some number of convolutional layers (2, 3 or 4) followed by a max-pooling layer that effectively downsamples.
27:16
Alternatively, you can also use striding as well. Then we have
27:21
another block of the same. The notation, which is quite common in the academic literature, is that you specify the number of filters, that is, the number of kernels, and then the 3 specifies the size. They often use quite small
27:37
filters, only 3 by 3 kernels, and then MP2 means max-pooling, downsampling by a factor of 2. Note that after we've done the downsampling, we double the number
27:50
of filters in the convolutional layers. Then finally, after the
27:56
blocks of convolutional and max-pooling layers, you're going to have the fully connected, also known as dense, layers. Typically, if you've got a large resolution coming
28:07
out of there, you want to work out what the dimensionality is at that point and roughly maintain it, or perhaps reduce it a bit, in the fully connected layer. You can have 2 or 3 fully connected layers if you like, and finally your output layer.
28:24
In this notation, the number on a fully connected layer just means
28:27
256 channels. OK, so overall, as discussed previously, the convolutional layers detect
28:36
features in various locations throughout the image. The fully connected layers then pull that information together and finally produce the output.
28:46
There are also some architectures you could look at for inspiration, the Inception networks by Google and ResNets by Microsoft, to get an idea of what some other people have been up to.
28:56
Now for some slightly more complex topics. Batch normalization: it's recommended in most cases, as it makes things better, and it's necessary for deep networks. By the way, I forgot to tell you: a deep neural network is simply a network with roughly more than 4 layers; that's all it is. And so,
29:16
if you want to train deep networks of more than 8 layers, you want batch normalization, otherwise they just won't train very well. It can also speed up training, in that your cost drops faster per epoch, although an epoch can take a little longer to run, and you can use higher learning rates as well. The reason why it's good
29:37
is this: think about the magnitude of the numbers. You might start out with numbers of a certain magnitude at your input layer, and that magnitude might be increased or decreased by multiplying by the weights
29:52
to get to the next layer. If you stack a lot of layers on top of each other, you can find that the magnitude of your values either increases exponentially or shrinks exponentially toward 0, and either one of those messes up the training completely. So batch normalization standardizes the values by subtracting the mean and dividing by the standard deviation after each layer. You want to insert it into your convolutional and fully connected layers after the matrix multiplication but before the bias and before the nonlinearity.
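The standardizing step at the heart of batch normalization can be sketched for a single unit as follows. This is a simplification made for illustration: it normalizes one unit's values across a minibatch, adds a small epsilon to avoid dividing by zero, and leaves out the learned scale and shift parameters and the running averages used at test time in full implementations. The activation values are invented.

```python
import math


def batch_normalize(batch, epsilon=1e-5):
    """Standardize one unit's activations across a minibatch."""
    mean = sum(batch) / len(batch)
    variance = sum((v - mean) ** 2 for v in batch) / len(batch)
    std = math.sqrt(variance + epsilon)   # epsilon guards against division by zero
    return [(v - mean) / std for v in batch]


# Hypothetical pre-activation values for one unit over a minibatch of 5.
activations = [120.0, 80.0, 100.0, 140.0, 60.0]
normalized = batch_normalize(activations)
```

Whatever the magnitude of the incoming values, the outputs have a mean of about 0 and a standard deviation of about 1, so magnitudes cannot grow or shrink exponentially as layers are stacked.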
30:24
The nice thing is that Lasagne does that for you with a single call, so you don't have to do too much surgery on your network yourself. Dropout is
30:34
pretty much necessary for training. You use it at train time, but you don't use it at prediction or test time, when you run a sample through the network to see what happens. It reduces
30:44
overfitting. Overfitting is a particularly horrific problem in machine learning, and it's going to bite you; it's going to bite you all the time in machine learning. It's what you get when you
30:58
train a model on training data and it gets very, very good at the samples that are in your training set, but when you show it an example it has never seen before, it just dies; it fails completely. Essentially what it means is that it
31:13
picks out features of those particular training samples and fails to generalize. Dropout combats this. What we've
31:27
got to do is randomly choose units in a layer and multiply a random subset of them by 0, usually around half of them, and you've got to keep the magnitude of the
31:39
output the same by scaling up by a factor of 2. Then, during test or predict, you just run as normal with the dropout turned off. You apply it after the fully connected layers; normally you
31:51
don't bother with the convolutional layers, although you can do it after those as well; the fully connected layers are where it's normally applied. To show you what it actually does: this is with dropout turned on. You see all the outputs going through little diamonds that represent our dropout. So we take half of them and turn them off, and you can see the gray
32:12
lines. What that essentially means is that, during training, the backpropagation won't affect those weights because the dropout kills them, and then the next time around you turn off a different subset of them. The reason it
32:30
works is that it causes the units to learn a more robust set of features, rather than learning to co-adapt and develop features that are a bit too specific to those units. So that's roughly how it combats overfitting.
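The dropout mechanism described above (zero a random subset of units, scale the survivors, switch it off at test time) can be sketched as follows. The function signature and the use of Python's `random` module are assumptions for the illustration, not Lasagne's API.

```python
import random


def dropout(activations, drop_probability=0.5, training=True, rng=random):
    """Zero a random subset of units and rescale the survivors."""
    if not training:
        return list(activations)          # test/predict time: dropout is off
    keep = 1.0 - drop_probability
    # Dividing by the keep probability (a factor of 2 when half are dropped)
    # keeps the expected magnitude of the output the same.
    return [
        a / keep if rng.random() < keep else 0.0
        for a in activations
    ]


rng = random.Random(0)
hidden = [1.0] * 1000
dropped = dropout(hidden, rng=rng)            # roughly half zeros, the rest 2.0
unchanged = dropout(hidden, training=False)   # identical to the input
```

Each call during training draws a fresh random mask, so a different subset of units is switched off for every minibatch.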
32:46
Dataset augmentation: because training neural networks is notoriously data hungry, you want to
32:53
reduce overfitting, and for that you need a larger training set. You can get one by artificially expanding your existing training set: taking a sample, modifying it somehow, and adding that modified version
33:06
to the training set. So for images, you're going to take the image and
33:11
shift it over by a certain amount, or up and down a bit; you're going to rotate it a bit, scale it a bit, or horizontally flip it. Be careful with that one: for example, if you've got images of people, flipping them vertically turns them upside down, and that will mess up the training set.
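Two of the modifications mentioned above, a horizontal flip and a small shift, might be sketched like this for single-channel images stored as lists of rows. The helper names are hypothetical, the shift assumes 0 <= pixels <= image width, and whether a given transformation is appropriate depends on the dataset.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def shift_right(image, pixels, fill=0):
    """Shift the image right, padding the exposed columns with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]


def augment(dataset):
    """Return the original (image, label) pairs plus modified copies."""
    augmented = list(dataset)
    for image, label in dataset:
        augmented.append((flip_horizontal(image), label))
        augmented.append((shift_right(image, 1), label))
    return augmented


image = [[1, 2, 3],
         [4, 5, 6]]
bigger = augment([(image, "cat")])   # three samples from one
```

The label is carried over unchanged, which is only valid when the transformation does not change what the image shows.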
33:28
So when you're doing dataset augmentation, think about what you need for your
33:32
dataset and what the network should output, and think about whether your transformations are a good idea. OK, finally, dataset
33:40
standardization. Neural networks train more effectively when your dataset has a mean of 0, that is,
33:50
zero mean and unit variance, or a standard deviation of 1. So, and also
33:58
with regression, you want to standardize your input data, and in regression you also want to standardize the
34:02
output I don't remember that in regression we also quantifying the server using real valued outputs we will make sure that standardizes well I've I've I've I've personally found that means that no 1 I haven't done that so uh but when you use your network when you deploy don't forget to uh do the reverse summarization together back into this into the space you back into this of scale range that you
34:27
want to be in in the 1st place and to do that salvation straddle samples and into the right and the case of images
34:34
he is going to go through all the images and extract all the pixels and splash Mountain big people the RGB channels
34:41
suffer and you're going to compute the the and compute the mean and standard deviation in red green and blue again 0 the mean by
34:51
subtracting it from divide by the standard deviation and that's standardization OK when training goes wrong as often will um you'll notice we wanna do is as you as he train you and and you might get an idea of what a lot what the value loss functions
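To make the standardization recipe described just above concrete, here is a small sketch using only the Python standard library. The function names, the tiny example images and the use of the population standard deviation are assumptions for illustration; the sketch also assumes no channel is constant, since that would give a standard deviation of zero.

```python
from statistics import fmean, pstdev

def channel_stats(images):
    """Per-channel mean and standard deviation over every pixel of every image.

    `images` is a list of images; each image is a flat list of (r, g, b) tuples.
    """
    means, stds = [], []
    for channel in range(3):
        values = [pixel[channel] for image in images for pixel in image]
        means.append(fmean(values))
        stds.append(pstdev(values))  # population standard deviation (assumed choice)
    return means, stds

def standardize_image(image, means, stds):
    """Subtract the channel mean and divide by the channel standard deviation."""
    return [tuple((pixel[c] - means[c]) / stds[c] for c in range(3)) for pixel in image]

def standardize(values, mean, std):
    return [(v - mean) / std for v in values]

def unstandardize(values, mean, std):
    """Reverse transform, e.g. for regression outputs at deployment time."""
    return [v * std + mean for v in values]

images = [[(0, 10, 20), (2, 10, 40)],
          [(4, 30, 60), (6, 30, 80)]]
means, stds = channel_stats(images)
standardized = [standardize_image(img, means, stds) for img in images]

# Regression targets are standardized the same way and mapped back afterwards.
targets = [10.0, 20.0, 30.0]
t_mean, t_std = fmean(targets), pstdev(targets)
scaled_targets = standardize(targets, t_mean, t_std)
recovered = unstandardize(scaled_targets, t_mean, t_std)
```

The statistics are computed once on the training set and then reused unchanged for validation data, test data and deployment.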
35:13
When it goes crazy and starts heading towards something like 10 to the 10 and eventually goes NaN, everything is going to hell.
35:21
So track your loss as you train your network so you can watch for this. OK, if you have an error rate
35:31
equivalent to that of a random guess, it's not learning anything. Essentially it's learning
35:40
to predict a constant value a
35:42
lot of the time. Sometimes there just isn't enough data for it to pick up patterns. Networks can also learn to predict a constant value. Let's say, for instance, that
35:54
you have a dataset divided into, say, 10 classes, and the last class only has about 0.5% of the examples. One of the best strategies the sneaky little network will figure out is to simply never predict that last class, because it's only going to be wrong in 0.5% of the cases. That's actually a pretty good way of getting the loss down to a pretty low value: concentrating on all the other classes and getting those right. The problem is that this is a local minimum, one of
36:30
the local minima of the cost function. Neural networks get stuck in them a lot, and it will be the bane of your existence. They most often don't learn what you expect them to or what you want them to. You look at it and think, as a human, "I know the result is this", and
36:48
the neural network manages to pick up on features and detect something quite different. So yeah, local minima:
36:57
the bane of your existence. To illustrate this, there's a really nice example that is available online, so let's talk about how to approach the design of a computer vision problem using neural networks. With a simple problem like handwritten digits, you could just use one neural network, and that one network will do great.
37:14
Wonderful. For more complex problems, that's often just not
37:18
enough. Neural networks are not a silver bullet, so please don't believe all the hype that surrounds deep learning right now. It's theoretically possible to use a single network for complex problems if you have enough training data, which is often an
37:32
incredible amount. So for more complex problems, you break the
37:37
problem down into smaller steps. I'm going to talk a bit about Felix Lau's second-place solution to the Kaggle competition on identifying right whales. His first naive solution was to
37:53
train a classifier to identify individuals. Let me just pull up his website. OK, cool.
38:02
OK, so effectively these
38:06
patterns on the head of a whale are what's used to identify an individual. The challenge is, given an image of a whale, to figure out which individual it is. This is the kind of image you get in the training set: you've got the ocean surrounding a little whale as he breaches, as he pokes his head over the surface, and you've got to figure out who it is from the picture. So Felix's first solution was to stick that through a classifier and see what happens. Let's scroll down and find it: OK, the baseline naive approach. What he found was that it gave results no better than random chance. So what he did was use what's called saliency detection, where you use a trick to figure out which parts of the image are influencing the network's output the most, and he found out that it was actually the bits of the ocean that were affecting it.
39:02
OK, try a thought experiment. I want you to imagine that I give you
39:06
this problem. I give you a bunch of images of right whales and say: that's number 1, that's number 7, that's number 13. But some evil person has given you really, really horrendous, horrible amnesia that has completely wiped your mind of the concept of what a whale is, what the ocean is, just about every human concept you have. So you're starting out with these images and zero knowledge at all, no semantic knowledge about the problem; you can't even guess what it is. You're given images and given numbers, and you're meant, from this training set, to figure out what these are. How do you know what's what?
39:35
How are you going to make that decision? Is that the ocean? Is that the whale? What part of the image actually helps you make a decision? If you think about it from this point of view, that's where your neural network is starting out from: starting
39:50
out from zero knowledge. And that's why the initial solution didn't work very well. It could work if you had a billion images with all the ground truths, if a whole load of marine biologists had gone in and hand-classified a billion images and put in an enormous amount of human effort, because
40:06
then the signal would eventually come through the noise. But we can't practically do that in real life. So his
40:18
solution was this: as I mentioned, the
40:20
region-based saliency showed that the network had locked onto the wrong features, so he trained what's called a localizer. Now, I've talked about classifiers
40:29
and regressors. What localizers do is look at an image and find the target: "my point of interest is over there in the image". So what he did was get the localizer to take the image of the whale and find out where the head is. After that, you run the classifier. The idea is that you first train it where to look
40:54
for the whale, so it can take a crop of the image and then just work on that piece. Furthermore, he trained a keypoint
41:05
finder. Again, you train a keypoint finder to find the front of the head and the back of the head, so you can then take the images of whales and rotate them so they're all in the same orientation and position. After that, having got some really uniform images of
41:23
whales, you give them to the classifier. So eventually that's it: train the classifier on oriented and cropped whale head images. That got second place in the Kaggle competition. So that was
41:36
kind of a nice illustration of how you've got
41:41
to be really careful about how you use these things. All right, so how am I doing for
41:47
time? Great. OK.
41:52
So, the Oxford VGG network and transfer learning. Using a
42:03
pretrained network is often a good idea. The Oxford VGG-19 is a 19-layer neural network that was trained on a big, million-image dataset called ImageNet. The great
42:15
thing is that they have generously made the network weights available under a Creative Commons license, CC
42:21
Attribution, and you can get them from there. There's also a Python pickled version you can grab as well.
42:31
They're very simple and effective models: they consist of 3 by 3 convolutions, max pooling and fully connected layers. That's the architecture. To classify an image with VGG-19, I'll show you an IPython notebook that will do that. All right,
42:56
so we're going to take an image to classify. There's our
43:00
peacock here. Sorry about that. OK, so to classify it, we're going to load in our pretrained network. I'm going to skip over the code that does that, but you can go through the notebook itself; it's all on GitHub. I've got to get through this quite quickly, so I'm racing through. OK, this is where we build the architecture: you can see the input layer, then all the convolutional layers and max pooling layers, and so on. I'll skip all this and go down to find the output, which has the softmax nonlinearity. We build it, and then we drop all the parameters in. OK, sorry about this; it's originally from a tutorial. Anyway, finally we show the image we're going to classify and we predict the probabilities here. Notice the output is a vector of a thousand probabilities, and we find that the predicted class is
44:11
84, with a probability close to 100%, and it's a peacock. You can run that yourself and you'll find that it works. So the cool thing is
44:19
you can take the pretrained network and use it yourself. Transfer
44:32
learning is a cool trick, and this is the last thing I'll show you. Training a neural network from scratch is data hungry: you need a ton of training data, and preparing it is time-consuming and expensive. What if we don't have enough training data to get good results, and we don't have the money to prepare it? Well, the ImageNet dataset is really huge: millions of images with ground truths. What if we could somehow use
44:58
it to help us? What if we could somehow use the ImageNet dataset, with its vast amount of data, to help us with a different task? Well, the good news is we can, and the trick is
45:10
this: rather than trying to reuse the data, you reuse a network that was trained on it, such as VGG-19. You download VGG-19, take part of that network and retain it, throw away the end part, and stick new stuff on the end that does what
45:30
we want. That way, all you effectively need to train is the bit that you've added, and then you fine-tune the whole lot. Essentially, you can reuse part of VGG-19 to, say, classify images that weren't in ImageNet, for classes
45:44
and different kinds of object category that weren't in ImageNet. You can also reuse it for localization, say you want to find the location of that whale's head, or maybe
45:56
for segmentation, where you want to find the exact outline of the boundary. So, transfer learning: to do this, we start out with VGG-19,
46:05
which looks like that. We chop off the last 3 layers and keep the stuff on the left. Then we create new ones, randomly initialized, on the end. Then we train the network with our training data, but we only learn the parameters of the new layers that we created. Then we fine-tune the
46:38
parameters of all layers: having trained the initial new ones, you
46:41
then fine-tune the whole lot. You train again, this time updating the parameters of all layers, and this will give you somewhat better accuracy. The result is a nice shiny new network with good performance on your particular target domain, much better than you could get by starting out with just your own dataset.
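The two-stage schedule described above (train only the new layers, then fine-tune everything) can be illustrated with a deliberately tiny toy model. This is only a sketch under stated assumptions: a single weight `base_w` stands in for the retained pretrained layers, a single weight `head_w` stands in for the newly added layers, and the data, the learning rates and the fixed starting value for the new weight are all made up for the example. Using a smaller learning rate for the fine-tuning stage is a common choice, not something stated in the talk.

```python
def predict(x, base_w, head_w):
    feature = base_w * x       # stand-in for the retained, pretrained layers
    return head_w * feature    # stand-in for the newly added layers

def loss(data, base_w, head_w):
    """Mean squared error over the dataset."""
    return sum((predict(x, base_w, head_w) - y) ** 2 for x, y in data) / len(data)

def train(data, base_w, head_w, lr, steps, update_base):
    """Plain gradient descent; the base weight is frozen unless update_base is True."""
    for _ in range(steps):
        grad_base = grad_head = 0.0
        for x, y in data:
            err = predict(x, base_w, head_w) - y
            grad_head += 2 * err * base_w * x / len(data)
            grad_base += 2 * err * head_w * x / len(data)
        head_w -= lr * grad_head
        if update_base:
            base_w -= lr * grad_base
    return base_w, head_w

target_data = [(1.0, 6.0), (2.0, 12.0), (3.0, 18.0), (4.0, 24.0)]
pretrained_base = 2.0   # pretend this was learned on a large source dataset
new_head = 0.1          # new layer; a fixed stand-in for random initialization

loss_start = loss(target_data, pretrained_base, new_head)

# Stage 1: learn only the parameters of the new layer.
base_1, head_1 = train(target_data, pretrained_base, new_head,
                       lr=0.01, steps=5, update_base=False)
loss_stage1 = loss(target_data, base_1, head_1)

# Stage 2: fine-tune the whole lot, here with a smaller learning rate.
base_2, head_2 = train(target_data, base_1, head_1,
                       lr=0.001, steps=20, update_base=True)
loss_stage2 = loss(target_data, base_2, head_2)
```

In a real setting the frozen part would be the convolutional layers of the downloaded network and the new part would be freshly initialized layers for the target task; the control flow stays the same.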
47:05
OK, so finally, some cool work in the field that might be of interest to you. I think I mentioned this briefly already: Zeiler and Fergus,
47:14
"Visualizing and Understanding Convolutional Networks". They decided to visualize the responses of the convolutional layers to
47:20
various images, to visualize what's going on in there. If you want to find out what your network is picking up, this is a good place to look for how to work out what it's detecting. And these guys decided to figure out whether they could fool a neural network, so they decided to generate images that are unrecognizable to human eyes but are recognized by the
47:41
network. So, for instance, the
47:45
network has a high confidence that this is in fact a robin. It looks like horrible noise, but it thinks that's a cheetah, that's an armadillo, that's a peacock. It's really only got similar colours. They then went
48:03
on to generate images that may still make some sense to humans: that's a king penguin,
48:10
that's a starfish. You can kind of see where it's picking things up: it's looking for texture, but it's not really looking for the actual structure of the object. So it's picking up certain things and ignoring other quite important features. You can also run neural networks in reverse: you can get them to generate images as well as classify them. So these
48:29
guys decided to make them generate chairs. They give it the
48:33
orientation and the type of the chair as parameters, and they train it to generate an image.
48:38
They can even get chairs to morph. And this one got a lot of press: "A Neural Algorithm of Artistic Style". They took OxfordNet and used it to extract texture
48:55
features from one image and apply them to another. So you take a photo of
49:05
this waterfront, and you take a painting like, say, Starry Night by Van Gogh, and it repaints the image in the style of Van Gogh, or in the style of The Scream, or these others. It's very cool, and the nice thing is that there are iPhone apps that do this now. And what these guys did,
49:24
this is a bit of a masterpiece of work. They
49:27
generated these images of bedrooms; they're generated by a neural network. The way they did it is they trained two neural networks: one to be a master forger, and the other to be the detective. The master forger tries to generate an image, and the detective tries to tell whether it's a real bedroom image or one that has been generated by the forger. The idea is that you train them in tandem so they both get better, so that the
49:49
master forger gets better and better until it generates pictures like that, which is kind of cool. And they even took it further by
49:56
figuring out what the parameters do, so by combining some of the parameters,
50:00
if you've seen some of the results from word2vec, the "king minus man plus woman is queen" stuff that has been done with word2vec, they were able to do similar things with facial expressions as well. Anyway, I hope you've
50:18
found this helpful. You've been a great audience.
50:23
thank you very much
50:26
Thank you. We have about nine minutes for questions. It was a great talk, thank you. I actually have several questions. The first one: when you are modelling a neural network, how do you choose, or is there a way to choose, how many hidden layers and neurons you put in them? Because that was an issue for me when I was modelling some. I'm not aware of
51:26
any particular rule of thumb for how to design a network architecture. The rule of thumb I use is to look at things that have worked for other people and go from that. So take the OxfordNet architecture: people have found that small convolutional kernels work well, so a few of those layers followed by max pooling, and those blocks repeat. I think there are some people who have tried things like grid search to try out the possibilities and search for the architecture automatically. But given the fact that training some ImageNet models can take up to two weeks, even on a really big GPU, that can be impractical. So I'm afraid to say this rule of thumb is it for us: just try it out and see. There are a couple of exceptions where people have done this adaptively, but I'm sorry, I don't have additional information. My second question:
52:23
we saw that you are analyzing images and numbers. Is there a way that you can make strings the input and recognize patterns in them? How would you do that? Would you have to transform them somehow, I mean, for text processing?
52:44
I think what people tend to do is convert each word into an embedding, which is a vector of a few hundred elements, and then use what's called a recurrent neural network, where rather than just going straight through to the output, the signal goes partially through and feeds back in earlier on, so that the network has an idea of time. I've not used recurrent neural networks myself, so I'm out of my comfort zone in terms of recurrent neural networks. But yes, they tend to use word embeddings to convert the words into vectors. The simple way of doing that is to just turn each word into a one-hot representation: if you've got 2,000 words in your vocabulary, you simply have a vector of all zeros except a one for the particular word that it is. But the sparsity of that often causes problems, which is why they use the embeddings.
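A small sketch of the two word representations mentioned in this answer, using only the standard library. The helper names, the five-word example and the embedding size of 3 are made up for illustration, and the embedding table here holds untrained random numbers; in practice those values are learned.

```python
import random

def build_vocab(words):
    """Assign each distinct word an integer index in order of first appearance."""
    vocab = {}
    for word in words:
        vocab.setdefault(word, len(vocab))
    return vocab

def one_hot(word, vocab):
    """Vector of all zeros except a one at the word's index."""
    vector = [0] * len(vocab)
    vector[vocab[word]] = 1
    return vector

def make_embeddings(vocab, dim, rng):
    """One dense row of `dim` numbers per word (random here, learned in practice)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in vocab]

def embed(word, vocab, table):
    """Look up the dense vector for a word."""
    return table[vocab[word]]

words = "deep networks learn deep features".split()
vocab = build_vocab(words)
table = make_embeddings(vocab, dim=3, rng=random.Random(0))

sparse = one_hot("learn", vocab)
dense = embed("learn", vocab, table)
```

The one-hot vector grows with the vocabulary and is almost entirely zeros, while the embedding stays at a fixed, much smaller size; multiplying a one-hot vector by the embedding table simply selects the corresponding row.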
53:46
And the last question, sorry: could you train a neural network to do maths, like addition, maybe multiplication? And if you can, would it maybe be faster than the usual way the processor does it? You can try
54:01
to do addition. I think there are actually some people who managed it with a digits dataset: you take two handwritten digits in an image, the network figures out what numbers they are, and it is then trained to produce the sum. I'm not sure it can work out multiplication; you'd have to figure out a way to get the units to do that. So the models are actually quite constrained in certain things, which is interesting: there are certain things that they don't do very well. As for whether it would be faster: no, because you'd be using however many mathematical operations to do something that is a one-instruction operation on a processor.
54:43
Thanks, that was really interesting, great stuff around the images. What are your thoughts on how neural networks could be applied to text analytics? Text analytics
55:03
is outside my area, so I don't know, but I would speak to Katharine Jarmul. She's here, and she did a very, very good talk giving a really good overview of what the text processing world is like, and she noted that neural networks are some of the best models right now. But it's outside my expertise; she knows that stuff, so I'd advise you to speak to her. Any other questions?
55:40
Given that the name "neural networks" comes from the science of the brain, do you know if they're used widely in that science? I'm not sure. I think that
55:52
the model that we use for the neural networks I've been talking about here is quite different from how neurons in the brain work. My very, very basic layman's understanding of brain neurons is that they operate on spike rates: they output spikes, and the frequency of those is often the strength of the output, I think; I don't know. So trying to liken these two to each other, I don't think they're that much alike. I think the similarity is that people looked at how neurons in the brain operate, came up with a simple model of that, and we've got something that seems to work well and produces very good pattern recognition. As for similarities to the brain beyond that, I can't really say more than that. Any other questions? Hi, have you heard of the
56:58
self-driving car using deep learning to implement the driving? I wonder how they would update the cost function, because it's a stream of video rather than a fixed static output. I've heard about
57:14
it, but I don't know the whole of how they're doing it. If you were to try and do something like that, one of the things you could do is prepare a bunch of footage where you say the human who's driving this car has done well, as in he hasn't crashed or killed anyone, something like that. So the idea is that's good, and maybe if there's some footage of accidents, that's bad, don't do that. What you'd probably want to do is say: given this video, produce these outputs, as in steer like this, accelerate, brake; produce these decisions. That's actually a little bit like the Atari game-playing networks that Google developed, which got really good scores in the video games. They take the input screens and decide whether to move up, down, left or right, or shoot. It's a similar thing, where instead of deciding whether to move up, down, left or right, or shoot, you control the steering wheel, accelerator and brakes. You could do it like that. But given my experience, and given the fact that, as I mentioned, particularly rare examples of rare situations might make up 0.0001% of your training set, it'll never actually learn anything from those because of the cost function; it'll discover local minima. I would not be very comfortable getting into a car that was just controlled by a neural network. I would not want my life in its hands. Any other
59:07
questions? Hi, do you ever combine neural networks with other techniques, like approximation algorithms? Approximation algorithms? Yeah, like optimization techniques; I'm thinking about the travelling salesman problem, for example.
59:31
I don't know if it's even been tried. I'm not aware of it, though I wouldn't be surprised if someone has tried it; I'm afraid I'm just unaware of it. Sorry. I'd guess that you'd have to figure out a way of constructing a cost function that somehow measures the quality of the solution, but I don't know how you'd go about it. I think we have time for
1:00:00
one last question. Maybe I'm asking a technical question about Theano, but when you apply dropout, does the Theano expression get recompiled or reorganized to be efficient, so that it doesn't take account of the switched-off weights? Or does it still do all the floating-point operations on the GPU or CPU, but with zeros, so that they don't affect the gradient? I think it's the second, because what you do is you get the random number generator to
1:00:52
generate zeros and ones, and you put that into the expression as a multiplication. I think it's just something that would be quite difficult to optimize, because for every single sample in the mini-batch you're blocking out a different subset of the units. I'm not even sure how you'd actually go about optimizing that in an efficient way, because you'd always have to select which units you're dropping and from that decide which operations you can save. You'd have to do it on the fly, and that can be quite tough. So I'm guessing, but I would guess that it doesn't. So now, since
1:01:29
there are no more questions and we are probably out of time, thanks once more for the talk.
00:00
Building
Artificial neural network
Code
Machine vision
Scientific modelling
Image processing
Artificial neural network
Code
Coma Berenices
Mass
Computational intelligence
Summation
Goodness of fit
Computer animation
Mathematics
Selforganization
Normal (geometry)
Resultant
Library (computing)
01:11
Artificial neural network
Scientific modelling
Multiplication sign
Artificial neural network
Weight
Convolution
Local Group
Computer animation
Vertex (graph theory)
Lipschitz continuity
Library (computing)
Resultant
Library (computing)
02:11
Slide rule
Computer animation
Computer file
Lecture/Conference
Code
Scientific modelling
Multiplication sign
Materialization (paranormal)
Repetition
Web browser
Logic gate
03:03
Slide rule
Slide rule
Divisor
Code
Computergenerated imagery
Virtual machine
Coma Berenices
Weight
Set (mathematics)
Virtual machine
Medical imaging
Social class
Roundness (object)
Computer animation
Profil (magazine)
Subtraction
Data type
Mutual information
Social class
04:02
Axiom of choice
Artificial neural network
Multiplication sign
Machine vision
Computergenerated imagery
Auto mechanic
Bit rate
Weight
Disk readandwrite head
Computational intelligence
Vector graphics
Medical imaging
Social class
Prediction
Error message
Computer animation
Vector space
Term (mathematics)
Fisher's exact test
Social class
04:50
Convolution
Artificial neural network
Computergenerated imagery
Expression
Artificial neural network
Compiler
Weight
Bit rate
Convolution
Compiler
Error message
Computer animation
Bit rate
Network topology
Term (mathematics)
Software
Energy level
Regular expression
Energy level
Game theory
Error message
Data structure
Spectrum (functional analysis)
Physical system
06:11
Dataflow
Slide rule
Artificial neural network
Computergenerated imagery
Artificial neural network
Coma Berenices
Bit
Medical imaging
Propagator
Social class
Computer animation
Vector space
Central processing unit
Website
Representation (politics)
Information
Descriptive statistics
07:20
Artificial neural network
Artificial neural network
Function (mathematics)
Line (geometry)
Weight
Number
Singleprecision floatingpoint format
Matrix (mathematics)
Computer animation
Vector space
Term (mathematics)
output
output
Row (database)
Social class
08:08
Slide rule
Pixel
Artificial neural network
Function (mathematics)
Parameter (computer programming)
Weight
Mereology
Medical imaging
Nonlinear system
Matrix (mathematics)
Sigmoid function
Term (mathematics)
Wellformed formula
Vector space
output
Pixel
Multiplication
Units of measurement
Rule of inference
Multiplication
Mapping
Artificial neural network
Sampling (statistics)
Planning
Parameter (computer programming)
Weight
Functional (mathematics)
Positional notation
Computer animation
Nonlinear system
Vector space
Function (mathematics)
Element (mathematics)
Matrix (mathematics)
Resultant
09:46
Set (mathematics)
Artificial neural network
Water vapor
Function (mathematics)
Morley's categoricity theorem
BackpropagationAlgorithmus
Wave packet
Measurement
Social class
Prediction
Average
Vector space
output
Category of being
Subtraction
Error message
Artificial neural network
Bit
Weight
Prediction
Functional (mathematics)
Wave packet
Maxima and minima
Category of being
Computer animation
Nonlinear system
Function (mathematics)
Cost curve
Linearization
output
10:52
Hidden surface determination
Artificial neural network
Element (mathematics)
Exponentiation
Morley's categoricity theorem
Functional (mathematics)
Likelihood function
Number
Maxima and minima
Medical imaging
Social class
Computer animation
Function (mathematics)
Vector space
Cost curve
output
Category of being
Social class
11:39
Hidden surface determination
Linear regression
Artificial neural network
Linear regression
Function (mathematics)
Functional (mathematics)
Summation
Medical imaging
Category of being
Summation
Computer animation
Function (mathematics)
Linearization
output
Dependent and independent variables
Error message
Identical particles
Subtraction
Identity management
12:35
Symbolic computation
Code
Gradient
Multiplication sign
Expression
Gradient
Parameter (computer programming)
Parameter (computer programming)
Weight
Derivation (linguistics)
Sign (mathematics)
Computer animation
Error message
Gradient descent
13:42
Multiplication sign
Gradient
Sampling (statistics)
Parameter (computer programming)
Bit rate
Parameter (computer programming)
Weight
Batch processing
Wave packet
Table (information)
Wave packet
Singleprecision floatingpoint format
Derivation (linguistics)
Sample (statistics)
Computer animation
Bit rate
Arithmetic mean
Lecture/Conference
Average
14:33
Perceptron
Gradient
Set (mathematics)
Artificial neural network
Parameter (computer programming)
Parallel port
Average
Wave packet
Power (physics)
Measurement
Architecture
Iteration
Average
Error message
Gradient descent
Units of measurement
Computer architecture
Operations research
Multiplication
Process (computing)
Validity (statistics)
Artificial neural network
Sampling (statistics)
Parallel port
Parameter (computer programming)
Set (mathematics)
Batch processing
Measurement
Wave packet
Entire function
Sample (statistics)
Error message
Computer animation
Right angle
Iteration
Central processing unit
Units of measurement
Digitizing
Gradient descent
16:07
Vulnerability (computing)
Scaling (geometry)
Numerical digit
Artificial neural network
Computergenerated imagery
Artificial neural network
Translation (relic)
Translation (relic)
Weight
Medical imaging
Invariant (mathematics)
Personal digital assistant
Approximation
Invariant (mathematics)
Digitizing
Marginal distribution
Position operator
Units of measurement
Vulnerability (computing)
16:56
Pixel
Web crawler
Computergenerated imagery
Artificial neural network
Weight
Computational intelligence
Wave packet
Summation
Medical imaging
Kernel (computing)
Operator (mathematics)
Lie group
Computerassisted translation
Pixel
Units of measurement
Hidden surface determination
Slide rule
Artificial neural network
Machine vision
Set (mathematics)
Convolution
Signal processing
Uniform resource locator
Computer animation
Universe (mathematics)
Vertex (graph theory)
output
Fiber bundle
Units of measurement
Digitizing
Bounded variation
18:10
Filter <Stochastik>
Convolution
Machine vision
Slide rule
Digital filter
Pixel
Correspondence (mathematics)
Design by contract
Insertion loss
Function (mathematics)
Mereology
Weight
Computational intelligence
Usability
Medical imaging
Lecture/Conference
Kernel (computing)
Green's function
Position operator
Product (category theory)
Variety (linguistics)
Artificial neural network
Machine vision
Neighbourhood (graph theory)
Weight
Line (geometry)
Functional (mathematics)
Convolution
Wave
Kernel (computing)
Computer animation
Vertex (graph theory)
output
Right angle
Units of measurement
Form (programming)
20:14
Convolution
Pixel
Variety (linguistics)
Orientation (vector space)
Computergenerated imagery
Range (statistics)
1 (number)
Function (mathematics)
Weight
Mereology
Medical imaging
Mathematics
Matrix (mathematics)
Kernel (computing)
Square number
Office suite
Pixel
Subtraction
Multiplication
Computer architecture
Multiplication
Artificial neural network
Memory management
Weight
Convolution
Demoscene
Sparse matrix
Kernel (computing)
Computer animation
Quicksort
Matrix (mathematics)
21:19
Point (geometry)
Machine vision
Pixel
Block (periodic table)
Computergenerated imagery
Gradient
Artificial neural network
Sampling (statistics)
Weight
Mass
Food energy
Medical imaging
Maxima and minima
Sample (statistics)
Computer animation
Graph coloring
Lecture/Conference
Computer configuration
Operator (mathematics)
Energy level
Divisor
Pixel
Multiplication
Matrix (mathematics)
22:38
Convolution
Pixel
Numerical digit
Artificial neural network
Multiplication sign
Convolution
Function (mathematics)
Convolution
Revision control
Medical imaging
Kernel (computing)
Sample (statistics)
Computer animation
Function (mathematics)
output
Quicksort
Pixel
Resultant
23:31
Pairwise comparison
Digital filter
Greatest element
Numerical digit
Artificial neural network
Set (mathematics)
Computergenerated imagery
Function (mathematics)
Wave packet
Medical imaging
Kernel (computing)
Computer animation
Iteration
Kernel (computing)
Units of measurement
24:24
Curve
Digital filter
Texture mapping
Artificial neural network
Orientation (vector space)
Cellular automaton
Computergenerated imagery
MIDI
Mereology
Kernel (computing)
Computer animation
Personal digital assistant
Kernel (computing)
Operator (mathematics)
Square number
Right angle
Lie group
Quicksort
Object (grammar)
25:19
Slide rule
Slide rule
Artificial neural network
Multiplication sign
Expression
Artificial neural network
Coma Berenices
Insertion loss
Function (mathematics)
Mereology
Functional (mathematics)
Compiler
Mathematics
Computer animation
Insertion loss
Lecture/Conference
Function (mathematics)
Energy level
Regular expression
26:41
Filter <Stochastik>
Convolution
Digital filter
Computergenerated imagery
Artificial neural network
Mereology
Number
Architecture
Arithmetic mean
Kernel (computing)
output
Computer architecture
Artificial neural network
Block (periodic table)
Sampling (statistics)
Bit
Mereology
Limit (category theory)
Convolution
Positional notation
Kernel (computing)
Computer animation
Graph coloring
output
Block (periodic table)
Resultant
27:36
Filter (stochastics)
Point (geometry)
Digital filter
Block (periodic table)
Computer-generated imagery
Bit
Mereology
Convolution
Number
Image resolution
Maxima and minima
Number
Population density
Kernel (computing)
Computer animation
Positional notation
Arithmetic mean
output
Block (periodic table)
Sampling (music)
28:27
Convolution
Batch processing
Information
Artificial neural network
Computer-generated imagery
Artificial neural network
Resonance
Function (mathematics)
Batch processing
Architecture
Database normalization
Medical imaging
Uniform resource locator
Positional notation
Computer animation
Arithmetic mean
Lecture/Conference
Personal digital assistant
Personal digital assistant
Normal (geometry)
output
Computer architecture
29:15
Scale (map)
Batch processing
Standard deviation
Multiplication
Artificial neural network
Artificial neural network
Order of magnitude
Insertion loss
Exponential function
Weight
Batch processing
Order of magnitude
Convolution
Number
Wave packet
Database normalization
Goodness of fit
Arithmetic mean
Matrix (mathematics)
Computer animation
Lecture/Conference
Normal (geometry)
output
Right angle
30:23
Machine learning
Musical ensemble
Artificial neural network
Set (mathematics)
Multiplication sign
Scientific modelling
Time travel
Artificial neural network
Sampling (statistics)
Prediction
Surgery
System call
Wave packet
Data model
Machine learning
Sample (statistics)
Computer animation
Software testing
Software testing
Curve fitting
31:12
Multiplication
Randomization
Divisor
Sampling (statistics)
Function (mathematics)
Drop (liquid)
Order of magnitude
Wave packet
Subset
Computer animation
Term (mathematics)
Software testing
Units of measurement
Units of measurement
Rhombus
32:12
Augmented reality
Artificial neural network
Set (mathematics)
Multiplication sign
Covering space
Sampling (statistics)
Bit
Drop (liquid)
Line (geometry)
Set (mathematics)
Weight
Backpropagation algorithm
Subset
Wave packet
Revision control
Subset
Latent heat
Roundness (object)
Sample (statistics)
Computer animation
Quicksort
Units of measurement
Units of measurement
33:05
Scale (map)
Standard deviation
Augmented reality
Transformation (genetics)
State of matter
Artificial neural network
Computer-generated imagery
Artificial neural network
Fitness function
Variance
Transformation (genetics)
Variance
Wave packet
Medical imaging
Arithmetic mean
Computer animation
Arithmetic mean
Lecture/Conference
Units of measurement
Units of measurement
33:57
Server (computing)
Pixel
Linear regression
Computer-generated imagery
Range (statistics)
Artificial neural network
Function (mathematics)
Inversion (music)
Medical imaging
Prediction
Flow separation
Lecture/Conference
Green's function
output
Pixel
Standard deviation
Scaling (geometry)
Spacetime
Linear regression
Artificial neural network
Sampling (statistics)
Arithmetic mean
Sample (statistics)
Computer animation
Personal digital assistant
Function (mathematics)
output
Right angle
Reverse engineering
34:50
Point (geometry)
Scale (map)
Standard deviation
Divisor
Artificial neural network
Computer-generated imagery
Insertion loss
Bit rate
Equivalence relation
Functional (mathematics)
Wave packet
Error message
Computer animation
Bit rate
Arithmetic mean
Insertion loss
Error message
35:38
Logical constant
Logical constant
Multiplication sign
Artificial neural network
Instance (computer science)
Maxima and minima
Maxima and minima
Computer animation
Insertion loss
Personal digital assistant
Pattern language
Local ring
Social class
36:28
Machine vision
Logical constant
Artificial neural network
Machine vision
Element (mathematics)
Artificial neural network
Computational intelligence
Existence
Number
Maxima and minima
Maxima and minima
Computer animation
Insertion loss
Cost curve
Local ring
Digitizing
Resultant
37:17
Complex (psychology)
Randomization
Function (mathematics)
Mereology
Rotation
Data model
Medical imaging
Duality (mathematics)
Computer cluster
Single-precision floating-point format
Pattern language
Row (database)
System identification
Bounded variation
Metropolitan area network
Trail
View (database)
Complex (psychology)
Bit
Mereology
Translation (relic)
Disk read-and-write head
Hypothesis
Maxima and minima
Uniform resource name
Pattern language
Figurate number
Random number
Mapping
Computer-generated imagery
Artificial neural network
Library catalog
Wave packet
Right angle
Data Augmentation
Rule of inference
Twin prime
Online help
Artificial neural network
Distribution (mathematics)
Surface
Video tracking
Coma Berenices
Weight
Stack (abstract data type)
Ultraviolet photoelectron spectroscopy
Shear stress
Single-precision floating-point format
Computer animation
Units of measurement
Series (mathematics)
38:59
Medical imaging
Network topology
Decision theory
Cuboid
Mereology
Number
Wave packet
39:50
Point (geometry)
Rounding
Mapping
Computer-generated imagery
Artificial neural network
Medical imaging
Video game
Right angle
Quantum
Addressing mode
output
Metropolitan area network
Raw image format
Interior (topology)
Limit (category theory)
Wave packet
Disk read-and-write head
Computer animation
Uniform resource name
Function (mathematics)
Noise
Dependent and independent variables
Cuboid
Local ring
Internationalization and localization
40:52
Software engineering
Metropolitan area network
Raw image format
Greatest element
Orientation (vector space)
Point (geometry)
Computer-generated imagery
Artificial neural network
Disk read-and-write head
Wave packet
Disk read-and-write head
Wave packet
Maxima and minima
Medical imaging
Pointer (computer programming)
Computer animation
Lecture/Conference
Mathematics
Lie group
Natural number
Pattern language
Location-based service
Position operator
41:40
Pressure
Artificial neural network
Computer-generated imagery
Artificial neural network
Heat transfer
Weight
System call
Wave packet
Disk read-and-write head
Attribute grammar
Revision control
Data model
Medical imaging
Social class
Computer animation
Lecture/Conference
42:28
Convolution
Structural load
Model theory
Scientific modelling
File format
Price index
Function (mathematics)
Parameter (computer programming)
Variable (mathematics)
Cartesian coordinate system
Data model
Summation
Medical imaging
Order (biology)
Prediction
Tensor
Network topology
Hausdorff dimension
Physical law
Row (database)
Pixel
Social class
Cliquewidth
Building
Computer file
Parameter (computer programming)
Term (mathematics)
Maxima and minima
Quantum state
Array data structure
Sample (statistics)
Vector space
Oval
Genetic programming
output
Representation (politics)
Curve fitting
Digital filter
Personal identification number
Graphics tablet
Set (mathematics)
Computer-generated imagery
Artificial neural network
Electronic mailing list
Drop (liquid)
Social class
Nonlinear system
Arithmetic mean
Lecture/Conference
Regular expression
output
Computer architecture
Artificial neural network
Dimensional analysis
Coma Berenices
Weight
Batch processing
Wave packet
Convolution
Plane (geometry)
Number
Population density
Computer animation
Nonlinear system
Function (mathematics)
Lie group
String (computer science)
Mathematics
Element (mathematics)
Determinism
Integer
Units of measurement
Form (programming)
Matrix (mathematics)
44:11
Online help
Artificial neural network
Computer-generated imagery
Artificial neural network
Weight
Heat transfer
Wave packet
Medical imaging
Goodness of fit
Computer animation
Lecture/Conference
Atomic number
Damping
Task (computing)
Resultant
44:58
Scientific modelling
Computer-generated imagery
Artificial neural network
Heat transfer
Bit
Mereology
Weight
Two-dimensional space
Mereology
Weight
Medical imaging
Category of being
Social class
Uniform resource locator
Computer animation
Object (grammar)
Subtraction
Task (computing)
Local ring
Task (computing)
Social class
45:56
Domain name
Artificial neural network
Multiplication sign
Artificial neural network
1 (number)
Parameter (computer programming)
Heat transfer
Parameter (computer programming)
Computer font
Distance
Wave packet
Wave packet
Maxima and minima
Goodness of fit
Computer animation
Arithmetic mean
Lecture/Conference
Boundary value problem
output
Cuboid
Resultant
47:00
Convolution
Artificial neural network
Confidence interval
Computer-generated imagery
Artificial neural network
Confidence interval
Coma Berenices
Instance (computer science)
Field (computer science)
Convolution
Demoscene
Degree (graph theory)
Medical imaging
Prediction
Computer animation
Field (mathematics)
Dependent and independent variables
Noise
Dependent and independent variables
Metropolitan area network
48:00
Convolution
Algorithm
Texture mapping
Gradient
Orientation (vector space)
Computer-generated imagery
Artificial neural network
Parameter (computer programming)
Digital photography
Medical imaging
Prediction
Iteration
Data structure
output
Reverse engineering
Gradient descent
Texture mapping
Artificial neural network
Prisoner's dilemma
Confidence interval
Parameter (computer programming)
Weight
Word
Arithmetic mean
Computer animation
Function (mathematics)
Orientation (vector space)
Object (grammar)
Reverse engineering
48:55
Medical imaging
Word
Touchscreen
Computer animation
Algorithm
Lecture/Conference
Artificial neural network
Real number
Computer-generated imagery
Bit
49:48
Metropolitan area network
Hidden surface determination
Touchscreen
Artificial neural network
Scientific modelling
Computer-generated imagery
Expression
Parameter (computer programming)
Protein folding
Word
Computer animation
Universe (mathematics)
Matching (graph theory)
Spectrum (functional analysis)
Resultant
51:25
Process (computing)
Information
Artificial neural network
Block (periodic table)
Semantics (computer science)
Rule of inference
Convolution
Wave packet
Number
Medical imaging
String (computer science)
output
Pattern language
Thumbnail
Computer architecture
Exception handling
52:41
Time zone
Addition
Process (computing)
Key (cryptography)
Artificial neural network
Multiplication sign
Embedding
Function (mathematics)
Local Group
Recurrence relation
Word
Sparse matrix
Vector space
Term (mathematics)
Representation (politics)
Quicksort
Figurate number
54:00
Medical imaging
Addition
Multiplication
Artificial neural network
Operator (mathematics)
Scientific modelling
Analytic set
Figurate number
Digitizing
Units of measurement
54:59
Area
Goodness of fit
Process (computing)
Plane (geometry)
Artificial neural network
Scientific modelling
Duality (mathematics)
Quicksort
Student's t-test
55:48
Frequency
Fluid statics
Pattern recognition
Goodness of fit
Process (computing)
Artificial neural network
Scientific modelling
Cost curve
Similarity (geometry)
Function (mathematics)
Streaming media
Physical system
57:13
Touchscreen
Artificial neural network
Decision theory
Neighbourhood (graph theory)
Electronic mailing list
Bit
Set (mathematics)
Travelling salesman problem
System call
Approximation
Wave packet
Maxima and minima
Goodness of fit
Shooting method
Video game
Approximation algorithm
Cost curve
Videoconferencing
Video game
output
Game theory
Mathematical optimization
59:30
Point (geometry)
Random number generation
Real number
Gradient
Expression
Sound effect
Weight
Measurement
Symbol table
Automatic differentiation
Compiler
Operator (mathematics)
Cost curve
Central processing unit
1:00:51
Operator (mathematics)
Multiplication sign
Expression
Element (mathematics)
Sampling (statistics)
Selectivity (electronic)
Subtraction
Units of measurement
Subset
1:01:43
Computer animation