An Introduction to Deep Learning

Speech transcript
Hi, this is an introduction to deep learning. Let's start by thanking Tariq Rashid for his excellent gentle introduction to neural networks; I'm going to build upon that and hopefully show you how to develop some of the networks that have been used to get the really good computer vision results we've seen recently. The focus this morning is mainly going to be on image processing, and this talk covers the principles and the maths behind it rather than the code. The reason is that it's quite a big topic: if I were to go through the code line by line, we wouldn't get through much that's actually useful. As a quick overview of what we'll cover: we're going to discuss the Theano library, which is the one I personally use, although a lot of people also love TensorFlow; we're going to cover the basic model of what a neural network is, building on Tariq's talk; we're going to go through convolutional networks, the kind of networks behind the really, really good results we've seen recently; we'll briefly cover Lasagne, which is another Python library built on top of Theano to make it easier to build neural networks, and discuss why it's there and what it does; then I'll give you a few hints about how to actually build your own network, how to structure it, which layers to choose, a rough idea of how to train it, and a few tips for practically getting going; and finally, time permitting, I'll go through the Oxford VGG network, a pre-trained network you can download under Creative Commons from the university and use yourself, and why it's sometimes useful to take a network that somebody else has trained for you and tweak it for your own purposes.

The nice thing is that there are supporting materials. This talk is based on the tutorial I gave at PyData London in May, and if you search GitHub for the Britefury deep learning tutorial from PyData 2016 you'll find the repository with all my notebooks, so you can view everything there in your browser. I would ask you, please, please, please do not try and run this code during the talk, because the part that uses the VGG Oxford models needs to download a roughly 500 MB weights file, and you will kill the Wi-Fi if you all start doing that, so please do it in your own time. Also, if you want more depth on Theano and Lasagne, I've put up some slides: check my SpeakerDeck profile and you'll find this talk's slides, as well as tutorials on Theano and on Lasagne, which give you a breakdown of the Python code, an idea of what it does and how to use it. Furthermore, if you don't have a machine available and you don't want to set one up yourself, there's an Amazon AMI set up for you, so if you want to use one of their GPU instances you can grab hold of that and run all the code there; everything's set up and I hope it's relatively easy to get into. All right, now to get into the
meat of the talk, and what better place to start than ImageNet. ImageNet is an academic image classification dataset with around a million images, I think it might be even more now, divided into a thousand different classes, so you've got various different kinds of objects: different poses of cats, flowers, buckets, whatever else you can think of, rocks, animals. The way the ground truth works is that you've got a bunch of images, I believe scraped from Flickr, and you've got a ground truth label for each image, which was prepared by getting people to label them on Mechanical Turk. Now, in the top-5 challenge, what you've got to do is produce a classifier that, when given an image, will produce a probability score for what it thinks it is, and you score a hit if the ground truth class, the true class, is somewhere within the top 5 choices of your model, or your neural network, or whatever you're using, for what it thinks the image is. In 2012, the best approaches of the time used a lot of hand-crafted features, for those of you who know computer vision: SIFT, HOGs, Fisher vectors, with maybe a linear classifier on top, and the top-5 error rate was around 25%. Then the game changed because of Krizhevsky, Sutskever and Hinton: in their paper, ImageNet Classification with Deep Convolutional Neural Networks, a bit of a mouthful, they managed to get the error down to 15%, and over the last three years more modern network architectures have got it down further, now to about 5 to 7%, and I think the people at Microsoft even got down to 3 or 4. I hope this talk is going to give you an idea of how it's done.
OK, let's have a quick run over Theano. Neural network software comes in two flavours; it's kind of a spectrum, really. You've got neural network toolkits, which are the high-level ones, and at the other end you've got expression compilers. With a toolkit you specify the neural network in terms of layers; expression compilers are somewhat lower level, and you describe the mathematical expressions that sit behind the layers and effectively describe the network, which is a more powerful and flexible approach. Theano is an expression compiler: you write NumPy-style expressions and it compiles them to run either on your CPU, or via CUDA on an NVIDIA GPU if you have one available. Once again, if you want to dig in, have a look at the slides I mentioned earlier; there's a lot more to Theano, so go check out the deeplearning.net website to learn more about it, there's a full description of the API and everything it can do, some of which you may want to use. There are of course others: there's TensorFlow, developed by Google, which is gaining popularity really fast these days; maybe it will be the future, we'll see.
OK, what is a neural network? We're going to cover a fair bit of what Tariq covered in the previous talk. It's got multiple layers, and the data propagates through each layer and is transformed as it goes through. So, for example, the image of a bunch of bananas goes through the first hidden layer and gets transformed into a different representation, which is transformed again by the next hidden layer, and finally, if we're doing an image classifier, we end up with a probability vector: all the values in that vector sum to 1, and our predicted class is the one corresponding to the row of the probability vector with the highest probability. This is what one kind of looks like: you can see the weights you saw in the previous talk, connecting all the pairs of units between the layers, and you can see data being put in at the input, propagating through and arriving at the output.

Breaking down a single layer of a neural network: we've got our input, which is basically a vector, an array of numbers, multiplied by a weight matrix, which is the criss-crossing lines; then we add a bias term, which is simply an offset, another vector; and then we have our activation function, or nonlinearity, those terms are roughly interchangeable. That gives the layer's activations, which go into the next layer, or are the output of the network if it's the last layer. Mathematically speaking, x is our sample vector and y is the output; we represent the weights by a weight matrix W, which is one of the parameters of our network; another parameter is the bias b; and we've got a nonlinearity function f. Normally these days that's the rectified linear unit, which is about as simple as they come: it simply maps negative values of x to 0; it's the activation function that has become the most popular recently. In a nutshell, y = f(Wx + b), repeated for each layer as the data goes through, and that's basically a neural network: the same formula repeated over and over, once for each layer. So to make an image classifier, we'd take the pixels from our image, flatten them onto a vector, stretching them out row by row, pass that into the network and get a result. In summary, a neural network is built from layers, each of which is a matrix multiplication, then a bias, then a nonlinearity.
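Here is a tiny sketch of that single-layer formula in NumPy; the shapes are illustrative, not from the talk.

```python
import numpy as np

# One layer: y = f(Wx + b), with a ReLU (rectified linear) nonlinearity.
def dense_relu(x, W, b):
    return np.maximum(W.dot(x) + b, 0.0)   # negative values are mapped to 0

x = np.random.randn(784)              # e.g. a flattened 28x28 image
W = np.random.randn(256, 784) * 0.01  # weight matrix (randomly initialised)
b = np.zeros(256)                     # bias vector
h = dense_relu(x, W, b)               # activations passed to the next layer
```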
OK, how do we train a neural network? We've got to learn values for all the parameters, the weights and the biases, for every layer, and for that we use backpropagation. We're going to initialise the weights randomly, more on this later, and initialise the biases all to 0. Then, for each example in our training set, we evaluate the network's prediction, see what it reckons the output is, and compare it to the actual training output, what it should produce given that input. We compute a cost function, which is, roughly speaking, the error: the difference between what our network is predicting and what it should predict, the ground-truth output.

Now, the cost function is kind of important, so let's discuss it a little. For classification, where the idea is, given an input and a bunch of categories, which category best describes this input, the final layer uses a function called softmax as its nonlinearity, or activation function, and it outputs a vector of class probabilities. The best way of thinking about it is this: say I've got a bunch of numbers, I sum them all up and divide each element by the sum; each value is then roughly a proportion, or probability, assuming all of the numbers start out positive. But they can also go negative inside a neural network, so softmax adds one little wrinkle before the summing up: we take our input numbers, compute the exponential of them all, sum those, and divide each exponential by that sum; that's softmax. The cost function is the negative log likelihood, also called categorical cross-entropy, and to compute it you take the log of the predicted probability. Say you have an image of a dog: you run the image through the network, see what the predicted probability is for dog, and take the log of that probability, which is going to be negative if the probability is less than 1; if the probability is close to 1 the log is close to 0, and if it's something like 0.1 it's quite strongly negative. So the idea is that if it's supposed to output dog, it should give a probability of 1; if it's giving a probability less than that, the negative log is quite positive, which indicates high error. So that's your cost.
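As a minimal illustration (the numbers are made up), softmax and the negative log likelihood for a single sample look like this:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())        # subtract the max for numerical stability
    return e / e.sum()             # each exponential divided by their sum

logits = np.array([2.0, 0.5, -1.0])  # raw network outputs before the softmax
p = softmax(logits)                  # class probabilities, they sum to 1
true_class = 0                       # say the image really is class 0 ("dog")
cost = -np.log(p[true_class])        # near 0 if p is close to 1, large otherwise
```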
Regression is different: rather than classifying an input and saying which category most closely matches it, you're trying to quantify something, measuring the strength of some response. Typically the final layer doesn't have an activation function, it's just the identity, linear, and the cost is the sum of squared differences.

Then what we're going to do is reduce the cost, reduce the error, using gradient descent. We have to compute the derivative, the gradient, of the cost with respect to all the parameters, which is all the weights and biases within our layers. The cool thing is that Theano does the symbolic differentiation for you. I can tell you right now that you do not want to be in a situation where you have this massive expression for your neural network and you've got to compute the derivative of the cost with respect to some parameter by hand, because you will make a mistake, you will flip a minus sign somewhere, your network will never learn, and debugging will be a nightmare because it's really hard to figure out where it's gone wrong. So I would recommend getting a symbolic mathematical package to do it for you; if you use Theano it handles it all and you never have to write that code. By the way, the other toolkits such as TensorFlow do this for you as well, which saves you time and sanity.

Then, to update your parameters, you take your weights and subtract the learning rate, lambda, multiplied by the gradient. Typically the learning rate should be somewhere in the region of 1e-4 to 1e-2, something in that range. Also, you typically don't train on one example at a time: you take a mini-batch of about 100 samples from your dataset, compute the cost for each of those samples, average the costs together, and then compute the derivative of the average cost with respect to all of your parameters. The nice thing about that averaging is that you can process those 100 samples in parallel, and when you run on a GPU that tends to speed things up a lot, because it uses all of the GPU's parallel processing power. Training on all the examples in your entire training set once is called an epoch, and you'll want multiple epochs of training, maybe a few hundred, say 300. So, in summary: take a mini-batch of training samples, evaluate it by running it through the network, measure the average error, called the cost, across the mini-batch, use gradient descent to modify the parameters to reduce the cost, and repeat.
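A toy stand-in for that loop, fitting a linear model with mini-batch gradient descent on a squared-error cost; in a real network Theano would supply the gradients instead of the hand-written one here.

```python
import numpy as np

rng = np.random.RandomState(0)
train_x = rng.randn(1000, 8)
train_y = train_x.dot(rng.randn(8))              # made-up regression targets
W = np.zeros(8)                                  # the parameters to learn

learning_rate = 1e-2                             # in the 1e-4 .. 1e-2 region
for epoch in range(300):                         # one epoch = one full pass
    for i in range(0, len(train_x), 100):        # mini-batches of 100 samples
        xb, yb = train_x[i:i + 100], train_y[i:i + 100]
        err = xb.dot(W) - yb                     # prediction error per sample
        grad = xb.T.dot(err) / len(xb)           # gradient of the mean cost
        W -= learning_rate * grad                # gradient descent step
```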
All right, the multilayer perceptron. This is the simplest neural network architecture; there's nothing here we haven't seen so far. It uses only fully connected, or dense, layers: in a dense layer, each unit is connected to every single unit in the previous layer. To pick up from Tariq's talk, the MNIST handwritten digits dataset is a good place to start, and a network with two hidden layers of 256 units each, after 300 epochs of training, gets about 1.83% validation error, so about 98.17% accuracy, which is pretty good.

However, these handwritten digits are a special case: all the digits are nicely centred in the image, they're in the same position and scaled to about the same size, as you can see in the examples. Fully connected networks have one weakness: no translational invariance. Imagine you want an image detector that finds a ball somewhere in an image; what that effectively means is that it will only learn to pick up the ball in the positions where it's been seen so far, it won't learn to generalise across all positions in the image. One of the cool things we can do is take the weights that we've learnt, take one of the units in the first hidden layer, and visualise the strengths of the weights that link it to all the pixels in the input layer, and this is what you end up with: you can see that your first hidden layer weights form a bundle of feature detectors that pick up the various strokes that make up the digits. It's kind of cool to visualise, and it shows you how the dense layers are translationally dependent. So for general imagery, say cats, dogs, various spiders and all the other creatures and things out there, you'd have to have a training set large enough to contain every single possible feature in every single location in the images, and a network with enough units to represent all of this variation. In that case you'd need a training set in the trillions and a network with billions and billions of nodes, and you don't have enough data; all the computers in the world running until the death of the universe wouldn't be enough to train it. Convolutional networks are how we address that. Convolution is a fairly common operation in computer vision and signal processing: you slide a convolutional kernel
over the image. Imagine the image pixels in one array, and your kernel, which has got a bunch of little weights in it: you multiply each value in the kernel by the pixel underneath it, do that for all values in the kernel, take those products and sum them all up, then slide the kernel along one position and do the same again, and what you end up with is an output image.
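A naive version of that operation in NumPy (strictly speaking a cross-correlation, which is what conv nets actually compute); purely illustrative.

```python
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # multiply the kernel by the pixels underneath it and sum
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out
```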
Convolution is often used for feature detection. A brief example: Gabor filters. If we produce a bunch of these filters, which are the product of a sine wave and a Gaussian, we get these soft, circular, wavy-looking things, and if you convolve with them you can see that they act as feature detectors that pick out certain features in the image. You can see that one of the vertical-bar filters roughly picks out the vertical lines in the image, and likewise the horizontal bars pick out the horizontal lines, so you can see how convolutions act as feature detectors; this was used quite a lot in computer vision.

Back to convolutional networks. To recap, that's what a fully connected layer looks like, with all of our inputs connected to all of our outputs. In a convolutional layer, you'll notice that a node on the right is only connected to a small neighbourhood of nodes on the left, and the next node down is connected to a small corresponding neighbourhood. The weights are also shared, which means we use the same value for all the red weights, the same for the greens, and so on for the others, and the values of those weights form the kernel, the feature detector; in practical computer vision these kernels are mostly learnt. In a convolutional network more than one kernel has to be used, because you've got to extract a variety of features: it's not sufficient to detect just the horizontal edges, you want the vertical ones and all the other orientations and sizes as well, so you have a range of different weight kernels, and the output has a channel for each of them. The idea is that you might have an image with one channel on the input and, say, 3 channels on the output, though in a typical convolutional network you might find something like 48 channels or 256, as in the examples later; some architectures go for a high dimensionality, in terms of the number of channels, in their inner parts. Each output pixel draws on the pixels in its neighbourhood across all channels in the previous layer. However, the maths is still the same, because a convolution can be expressed as multiplication by a weight matrix; it's just that the weight matrix is sparse. Conceptually the maths doesn't really change, which is fortunate for us, because it means the gradients and everything we've done so far still work; as for how you go about figuring that out, I'd just recommend letting Theano do it for you rather than doing it yourself. There's one more thing we need: downsampling. Typically, if you've worked in an image editor, at some point you'll have shrunk an image down by a certain amount, say to 50%, reducing its resolution. For that we use two operations: max pooling and striding.
With max pooling, you can see the image up there is divided into coloured blocks; say the blue block has 4 pixels: we take those 4 pixels and pick the maximum value, so rather than averaging we just take the maximum, and that's max pooling. It downsamples the image by a factor of p, where p is the size of the pooling, and it operates on each channel independently. The other option is striding, and what we do there is effectively pick a sample, skip a few, pick a sample, skip a few; it's even simpler, and it's often quite a lot faster, because the convolution operations support strided convolution, where rather than computing every output position you effectively jump over by a few pixels each time, so that's faster and you get similar results.
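For example, 2x2 max pooling and striding over a single channel can be written like this (a sketch; it assumes the image dimensions are even).

```python
import numpy as np

def max_pool_2x2(image):
    h, w = image.shape
    # group the pixels into 2x2 blocks and keep the maximum of each block
    return image.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)
pooled = max_pool_2x2(image)      # 3x3 output, resolution halved
strided = image[::2, ::2]         # striding: just take every other sample
```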
Moving on. LeNet used convolutional networks to solve MNIST; it dates from 1995, and this is a simplified version of its architecture. You've got an input image with one channel, it's monochrome; we apply 20 kernels, 5 by 5, which gives us a 24 by 24 image with 20 channels; max-pool to shrink it by half; then 50 kernels, 5 by 5, so now we've got a 50-channel image; max-pool again to shrink it by half; then we flatten it and apply a fully connected, dense, layer to 256 units, and finally a fully connected 10-unit output layer for the class probabilities. After training over the training set we get 99.21% accuracy, an improvement on the multilayer perceptron, so not too bad.

What about the learned kernels? It's interesting to think about what the feature detectors are picking up. If we look at a big dataset, ImageNet, this is from the Krizhevsky paper I mentioned at the beginning: these are the kernels learned by the neural network, and for comparison the Gabor filters are over there. The reason the colour ones are at the bottom is just an artefact of the way they split the computation across two GPUs, but if you look at the top you can see how it's picked up all sorts of edge detectors of various sizes and orientations; that's the first layer. Zeiler and Fergus went further: they figured out a way of visualising how the kernels in the second layer respond, so you can see cells that respond to somewhat more complex features: squares, curves, textures, lines, circular features. Further up, at about layer 3, you get more complex features still; we've got things that recognise simple parts of objects. This gives you an idea of roughly how convolutional networks fit together: the layers operate as feature detectors, where each layer builds on the previous one, picking up ever more complex features.
OK, now let's move on to Lasagne. If you specify the network using mathematical expressions in Theano, it's really powerful but quite low level; if you have to write out your network's mathematical expressions by hand every time, it can get painful. Lasagne is built on top of Theano and makes it nicer to build networks: with its API, rather than specifying the mathematical expressions directly, you construct the layers of the network and then ask it for the expressions for the output and for the loss; it's a Python layer on top of Theano. The cool thing about having one of these mathematical expression compilers underneath is that if you want some crazy new loss function, or you want to do something new and inventive, however creative you want to be, you can just go and write the maths and let Theano take care of figuring out how to run it with NVIDIA's CUDA, without having to worry about that yourself. It's quite easy to get going, you do it all in Python, and that's why I like it. Once again, slides are available if you want to dive into more detail.
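To make that concrete, here's a minimal sketch of the Lasagne workflow; the layer sizes and the choice of the Adam update rule are mine, not from the talk.

```python
import theano
import theano.tensor as T
import lasagne

x = T.tensor4('x')        # image batch: (N, channels, height, width)
y = T.ivector('y')        # integer class labels

# Build the network as layers rather than raw expressions
net = lasagne.layers.InputLayer((None, 1, 28, 28), input_var=x)
net = lasagne.layers.Conv2DLayer(net, num_filters=32, filter_size=(3, 3))
net = lasagne.layers.MaxPool2DLayer(net, pool_size=(2, 2))
net = lasagne.layers.DenseLayer(net, num_units=256)
net = lasagne.layers.DenseLayer(net, num_units=10,
                                nonlinearity=lasagne.nonlinearities.softmax)

# Ask Lasagne for the output and loss expressions, then let Theano compile
pred = lasagne.layers.get_output(net)
loss = lasagne.objectives.categorical_crossentropy(pred, y).mean()
params = lasagne.layers.get_all_params(net, trainable=True)
updates = lasagne.updates.adam(loss, params, learning_rate=1e-3)
train_fn = theano.function([x, y], loss, updates=updates)  # one training step
```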
As for how to build and train neural networks, let's start with a bit about architecture. If you want to know what to use to get a network that's actually going to work, I'll try to give you some rough ideas of which layers to use and how to put them together so it gives you good results. The early part of the network, after your input layer, is going to be blocks consisting of some number of convolutional layers, two, three or four of them, followed by max pooling, which effectively downsamples; alternatively you can use striding. Then you have another block of the same. You'll notice this notation, which is quite common in the academic literature: you specify the number of filters, the number of kernels, and the 3 specifies their size; quite small filters, 3 by 3 kernels, are often used, and MP2 means max pooling that downsamples by a factor of 2. Note that after you've done the downsampling, you double the number of filters in the convolutional layers. Then, finally, after the blocks of convolutional and max-pooling layers, you have fully connected, also known as dense, layers. Typically, if you've got a large resolution coming out of the convolutional part, you want to work out what the dimensionality is at that point and roughly maintain it, or reduce it a bit, in the fully connected layer; you can have two or three fully connected layers if you like, and finally your output layer. In this notation, FC256 for a fully connected layer just means 256 units. OK, so overall, as discussed previously, the convolutional layers detect features in various locations throughout the image, and the fully connected layers pull that information together and finally produce the output.
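In Lasagne, that block pattern might look something like the following sketch; the filter counts, image size and class count are illustrative, not a prescription from the talk.

```python
import lasagne

def conv_block(net, num_filters):
    # two 3x3 convolutional layers followed by 2x2 max pooling
    net = lasagne.layers.Conv2DLayer(net, num_filters, (3, 3), pad='same')
    net = lasagne.layers.Conv2DLayer(net, num_filters, (3, 3), pad='same')
    return lasagne.layers.MaxPool2DLayer(net, (2, 2))

net = lasagne.layers.InputLayer((None, 3, 64, 64))
for n in (32, 64, 128):            # double the filters after each downsample
    net = conv_block(net, n)
net = lasagne.layers.DenseLayer(net, 256)      # FC256
net = lasagne.layers.DenseLayer(net, 10,       # output layer
                                nonlinearity=lasagne.nonlinearities.softmax)
```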
There are also some other architectures you could look at for inspiration: the Inception networks by Google, and the ResNets from Microsoft, to give you an idea of what other people have been up to.

Now on to some slightly more complex topics. Batch normalisation: it's recommended in most cases, it makes things better, and it's pretty much necessary for deep networks. By the way, when people talk about deep learning, a deep neural network is simply a network with more than roughly four layers, let's say. If you want to train a deep network of more than about eight layers, you want batch normalisation, otherwise it often just won't train very well; it also speeds up training, so your loss, your cost, drops faster per epoch, and you can use higher learning rates as well. The reason it's good is this: think about the magnitude of the numbers. You start out with numbers of a certain magnitude at your input layer, and that magnitude might be increased or decreased a bit by multiplying by the weights to get to the next layer, and when you stack lots of layers on top of each other you can find that the magnitude of your values either exponentially increases or exponentially shrinks towards 0, and either of those will break the training completely. So batch normalisation standardises the values, subtracting the mean and dividing by the standard deviation, and you want to insert it into your convolutional and fully connected layers after the matrix multiplication and before the bias and the nonlinearity. The nice thing is that Lasagne does that for you with a single call, so you don't have to do too much surgery on your network yourself.
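For instance, wrapping a convolutional layer in batch normalisation is one call in Lasagne; the layer sizes here are placeholders.

```python
import lasagne
from lasagne.layers import InputLayer, Conv2DLayer, batch_norm

net = InputLayer((None, 3, 64, 64))
# batch_norm() inserts the normalisation between the convolution and its
# nonlinearity, so you don't have to rewire the layer yourself.
net = batch_norm(Conv2DLayer(net, num_filters=64, filter_size=(3, 3)))
```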
Dropout is pretty much necessary for training: you use it at training time, but you don't use it at prediction and test time, when you run samples through the network to see what happens. It reduces overfitting. Overfitting is a particularly horrific problem in machine learning, and it's going to bite you all the time: it's what you get when you train a model on the training data and it gets very, very good at the samples in your initial training set, but when it's shown an example it's never seen before it fails completely. Essentially it picks out features of those particular training samples and fails to generalise. So, dropout: what we do during training is randomly choose units in a layer and multiply a random subset of them, usually around half, by 0, and we keep the magnitude of the output the same by scaling up by a factor of 2. Then during test and prediction you just run as normal with the dropout turned off. You normally apply it after the fully connected layers; you can do it after the convolutional layers as well, but it's the fully connected layers where it's most strongly applied. Let me show you visually what it actually does. This is a layer with dropout: you see all the outputs going through the little diamonds, which represent the dropout, so we take half of them and kill them, shown as the greyed-out lines. What that essentially means is that during training the backpropagation won't affect those weights, because the dropout kills them, and the next time around you turn off a different subset of them. The reason it works is that it causes the units to learn more robust features, rather than co-adapting and developing features that are too specific to particular units. That's roughly how it combats overfitting.
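In Lasagne, dropout is just another layer, switched off at prediction time with the deterministic flag; a sketch, again with made-up layer sizes.

```python
import lasagne
from lasagne.layers import InputLayer, DenseLayer, DropoutLayer, get_output

net = InputLayer((None, 256))
net = DenseLayer(net, num_units=256)
net = DropoutLayer(net, p=0.5)           # drop about half the units in training
net = DenseLayer(net, num_units=10,
                 nonlinearity=lasagne.nonlinearities.softmax)

train_out = get_output(net)                      # dropout active
test_out = get_output(net, deterministic=True)   # dropout switched off
```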
Dataset augmentation: training neural networks is notoriously data hungry, you want to reduce overfitting, and you'd like a larger training set. You can get one by artificially modifying your existing training set: taking a sample, modifying it somehow, and adding that modified version to the training set. For images, you're going to take a sample and shift it over by a certain amount, move it up or down, rotate it a bit, scale it, or horizontally flip it. Be careful with that last one: if, for example, you've got images of people, flipping them vertically so they're upside down will corrupt your training set. So when you're doing dataset augmentation, think about what you need for your dataset and what the network should output, and think about whether your transformations are a good idea.
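A couple of simple augmentations for an image batch of shape (N, C, H, W); this is a sketch of the idea, not the speaker's code.

```python
import numpy as np

def augment(batch, max_shift=2):
    out = batch.copy()
    for i in range(len(out)):
        if np.random.rand() < 0.5:                 # random horizontal flip
            out[i] = out[i, :, :, ::-1]
        dy, dx = np.random.randint(-max_shift, max_shift + 1, size=2)
        out[i] = np.roll(np.roll(out[i], dy, axis=1), dx, axis=2)  # small shift
    return out
```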
Finally, dataset standardisation. Your networks train more effectively when your dataset has a mean of 0, zero-centred values, and unit variance, a standard deviation of 1. You want to standardise your input data, and for regression you want to standardise the output as well; remember that in regression we're quantifying something using real-valued outputs, so make sure those are standardised too, because I've personally found that the network won't learn when I haven't done that. When you use your network, when you deploy it, don't forget to do the reverse transformation to get back into the scale and range you want to be in in the first place. To do the standardisation in the case of images, you go through all the images, extract all the pixels, split them into the RGB channels, and compute the mean and standard deviation for red, green and blue; you zero the mean by subtracting it and divide by the standard deviation, and that's standardisation.
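Per-channel standardisation for an image array shaped (N, 3, H, W) might look like this; `train_x` is a stand-in for your training images, and you keep the mean and std so you can undo the transformation later.

```python
import numpy as np

train_x = np.random.rand(1000, 3, 32, 32)            # stand-in for your images
mean = train_x.mean(axis=(0, 2, 3), keepdims=True)   # one mean per RGB channel
std = train_x.std(axis=(0, 2, 3), keepdims=True)     # one std per RGB channel
train_x_std = (train_x - mean) / std                 # zero mean, unit variance
```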
OK, when training goes wrong, as it often will: one thing you want to do as you train is keep an eye on the value of the loss function, because when it goes crazy and starts heading towards something like 10 to the 10, everything eventually goes to hell. So plot your loss as you train your network, so you can watch for this. If you have an error rate no better than a random guess, it's not learning anything; a lot of the time it's essentially learning to predict a constant value. Sometimes there just isn't enough data for it to pick up the patterns, and it can also learn to predict a constant value for a sneakier reason. Let's say, for instance, that you have a dataset divided into, say, 10 classes, and the last class only has about 0.5% of the examples. One of the sneaky things the network will figure out is to simply never predict that last class, because then it's only going to be wrong in 0.5% of the cases, and that's actually a pretty good way of getting the loss down, by concentrating on all the other classes and getting those right. The problem is that this is a local minimum, a local minimum of the cost function, and your network can get stuck in there a lot, and it will be the bane of your existence. Networks most often don't learn what you expect them to: you look at the problem and think, as a human, I would arrive at the result this way, but a neural network will tend to pick up features and detect something quite different.
Speaking of the bane of your existence, to illustrate this there's a really nice example available online, and it's about how to design a computer vision solution using networks. With a simple problem like handwritten digits you can just use one network and it will do great, wonderful; with more complex problems that's often just not enough. Neural networks are not a silver bullet, so please don't believe all the hype that surrounds deep learning right now. It's theoretically possible to use a single network for complex problems if you have enough training data, which is often an incredible amount. So for more complex problems you break the problem down into smaller steps, and I'm going to talk a bit about Felix Lau's second-place solution to the Kaggle competition on identifying right whales. His first, naive solution was to train a classifier to identify the individuals; this comes from the write-up on his website.

Effectively, the patterns on the head of a whale are what's used to identify an individual, and the challenge is: given an image of a whale, figure out which individual it is. This is the kind of image you get in the training set: you've got the ocean surrounding a little whale as it breaches, as it pokes its head over the surface, and you've got to figure out who it is from the picture. His first solution was to stick that through a classifier and see what happens, a baseline naive approach, and what he found was that it gave results no better than random chance. So then he used what's called saliency detection, where you use a trick to figure out which parts of the image are influencing the network's output the most, and he found out that it was actually picking up on the bits of the ocean, the reflections.

OK, a quick thought experiment: imagine that I give you this problem. You get a bunch of images of right whales, and I say that's number 1, that's number 7, that's number 13, but you've been given a really horrendous, horrible amnesia that has completely wiped from your mind the concept of what a whale is, what the ocean is, just about every human concept you have. You're starting out with these images with zero knowledge, no semantic knowledge about the problem; you can't even guess what it is; you're just given images and numbers, and from this training set you have to figure out what these things are. How are you going to make that decision? Is it the ocean, is it the whale, what part of the image actually helps you make the decision? If you think about it from that perspective, you understand where the network is starting from: it's starting out from zero knowledge, and that's why the initial solution didn't work very well. Now, if you had a billion images with ground truths, if armies of marine biologists had gone and hand-classified a billion images and put in an enormous amount of human effort, then the signal would eventually come through the noise, but we can't practically do that in real life.
So his solution, informed by the saliency analysis that showed the network had locked onto the wrong features, was to train what's called a localiser. Now, I've talked about classifiers and regressors; what a localiser does is look at an image and find where a target, a point of interest, is within the image. So he got the localiser to take the image of the whale and find out where its head is, and after that you can run the classifier: the idea is you first train the localiser, crop the whale's head out of the image, and then work on just that piece. Furthermore, he trained a keypoint finder, again a learnt model, to find the front of the head and the back of the head, so you could then take the image of a whale and rotate it so they're all in the same orientation and position, and after that he had really uniform images of whales to give to the classifier. Eventually he trained the classifier on oriented and cropped whale head images, and that got him second place in the Kaggle competition. So that was a nice illustration of how you've got to be really careful about how you use these things. All right, I'm doing all right for
time, great. OK: you might not have a big dataset for the problem of your choosing, and this is where transfer learning with a pre-trained network is often a good idea. The Oxford VGG-19 network, a 19-layer neural network, was trained on ImageNet, the roughly million-image dataset, and the great thing is that the group has generously made the network weights available under a Creative Commons Attribution licence, so you can get them; there's also a Python pickled version you can grab. They're very simple and effective models: they consist of 3 by 3 convolutions, max pooling and fully connected layers, and that's the architecture. To classify an image with VGG-19 there's an IPython notebook that will do that. So we take an image to classify, and a peacock here will do. In the notebook we load in our pre-trained network; I'm going to skip over some of the detail, but you can go through the notebook itself, it's on the GitHub repo, so I'll go through this quite quickly. This is where we build the architecture: you can see the input layer, all the convolutional and max-pooling layers, and if we skip down we find the output, which is the softmax nonlinearity; then we build it and drop all the pre-trained parameters in. Finally we show the image we want to classify and predict the probabilities; you'll notice the output is a vector of a thousand probabilities, and the predicted class, with a probability of about 98 to 100%, is peacock. You can run that yourself and see how it works.
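Condensed, the notebook does something like the following; `build_vgg19()` and `prep_image()` are stand-ins for the notebook's helpers that build the Lasagne layer dictionary (keyed 'input' through 'prob', as in the Lasagne model-zoo recipes) and resize and mean-subtract the image, so treat the details as assumptions.

```python
import pickle
import numpy as np
import theano
import lasagne

net = build_vgg19()                              # dict of Lasagne layers (assumed)
with open('vgg19.pkl', 'rb') as f:               # pre-trained weights file
    model = pickle.load(f)
lasagne.layers.set_all_param_values(net['prob'], model['param values'])

x = net['input'].input_var                       # symbolic input batch
prob = lasagne.layers.get_output(net['prob'], deterministic=True)
predict_fn = theano.function([x], prob)          # compile the forward pass

im = prep_image('peacock.jpg')                   # shape (1, 3, 224, 224), assumed helper
p = predict_fn(im)[0]                            # 1000 class probabilities
print(np.argmax(p), p.max())                     # index into the ImageNet classes
```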
So the cool thing is you can take the pre-trained network and use it yourself. Transfer learning is a cool trick. As we've seen, training a network from scratch is data hungry, and the reason is that obtaining training data and preparing it is time-consuming and expensive. What if we don't have enough training data to get good results, or enough money to prepare it? Well, the ImageNet dataset is really huge, millions of images with ground truths, so what if we could somehow use a model trained on that vast dataset to help us with a different task? The good news is we can, and the trick is this: rather than train your own network from scratch, you download VGG-19, take part of that network, throw away the end part of it, and stick new stuff on the end that does what you want; then you train the bit you've added, and fine-tune the whole lot afterwards. Essentially you can reuse part of VGG-19 to, say, classify images that weren't in ImageNet, with different kinds of object categories from the ones in ImageNet. You can also reuse it for localisation, say if you want to find the location of that whale's head, or for segmentation, where you want to find the exact outline, the boundary, of an object. To do the transfer learning, here's what we do: VGG-19 looks like that, and we chop off the last few layers, the stuff at the end, and create new ones, randomly initialised, in their place. Then we train the network with our own training data, but we only learn the parameters of the new layers that we created; we don't train the parameters of the other layers yet. Having trained those initial new layers, you then fine-tune the whole lot: you train again, this time updating the parameters of all the layers, and this gives you somewhat better accuracy. The result is a nice shiny new network with good performance on your particular target domain, and it can be substantially better than you could get by starting from scratch with just your own dataset.
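Roughly what that looks like in Lasagne, reusing the pre-trained layer dictionary from the sketch above; the layer name 'pool5', the 256-unit size and the 4-class output are illustrative assumptions.

```python
import lasagne
from lasagne.layers import DenseLayer, DropoutLayer

features = net['pool5']                    # keep the pre-trained layers up to here
new = DenseLayer(features, num_units=256)
new = DropoutLayer(new, p=0.5)
new_out = DenseLayer(new, num_units=4,     # new randomly-initialised classifier
                     nonlinearity=lasagne.nonlinearities.softmax)

# Phase 1: update only the parameters of the newly created layers
base = set(lasagne.layers.get_all_params(features, trainable=True))
all_params = lasagne.layers.get_all_params(new_out, trainable=True)
new_params = [p for p in all_params if p not in base]
# ...build the loss as usual and pass new_params to e.g. lasagne.updates.adam.
# Phase 2: fine-tune by passing all_params, with a small learning rate.
```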
OK, finally, some cool work in the field that might be of interest to you. I think I mentioned this briefly already: Zeiler and Fergus, in Visualizing and Understanding Convolutional Networks, visualised the responses of the convolutional layers to various images, to see what's going on inside and find out what it's picking up; that's a good place to look if you want to work out what your network is detecting. Then these other guys decided to fool convolutional neural networks: they generated images that are completely unrecognisable to humans but are recognised by the network. So for instance the network has a high confidence that this is in fact a robin, even though to us it looks like horrible noise, and it thinks that's a cheetah, that's an armadillo, that's a peacock, when really it's just noise and colour. They then went on to generate images that make somewhat more sense to humans: that's a king penguin, that's a starfish, and you can kind of see what it's picking up; it's looking for texture, but it's not really looking at the actual structure of the object, so it's picking up certain features and ignoring other quite important ones. You can also run your networks in reverse: you can get them to generate images as well as classify them. These guys decided to make a network generate chairs: they give it the orientation and the parameters of the chair, and it tries to generate an image of chairs similar to the ones it learnt from. And this one got a lot of press: neural artistic style transfer. They took the Oxford VGG net and used it to extract texture
features from one image and apply them to another. So you take a photo, say of this waterfront, and you take a painting, say The Starry Night by van Gogh, and it repaints the photo in the style of van Gogh, or in the style of The Scream, or these others; it's very cool, and the nice thing is there are iPhone apps that do this now. And this is a bit of a masterpiece of a piece of work: these images of bedrooms were generated by a neural network, and the way they did it is they train two networks, one to be a master forger and the other to be the detective. The master forger tries to generate an image, and the detective tries to tell whether it's a real image or one that has been generated by the forger, and the idea is that as they adapt they both get better, so the master forger gets better and better until it generates pictures like that, which is quite remarkable. They even took it further by figuring out what happens when you combine some of the parameters, doing arithmetic on them a bit like the king minus man plus woman example you may have seen from word2vec, and they've done similar things with facial expressions as well. Anyway, I hope you've found this helpful; you've been a great audience, thank you very much.
Q: Thank you, and we have about nine minutes for questions. It was a great talk, thank you. I actually have several questions. The first one: when you are modelling a neural network, how do you choose, or is there a way to choose, how many hidden layers and neurons you need? That was an issue for me when I was modelling something.

A: I'm not aware of any particular rule for how to design a network architecture. The rule of thumb I use is to look at things that have worked for other people, so the VGG-style architecture where you've got small convolutional kernels, a few of those layers followed by max pooling, and those blocks repeated. I think some people have tried things like grid search to try lots of architectures automatically, but given that for something like ImageNet training can extend into weeks, or many hours at least on a really big GPU, that can be impractical. So I'm afraid the rule of thumb is to look at what's worked for other people, just try it out, and see.

Q: OK, thank you. My second question: we saw that you're analysing images and numbers; is there a way you can make strings the input and recognise patterns in them? How would you do that, would you have to transform them somehow? I mean for text processing.

A: I think what people tend to do there is convert each word into an embedding, which is a vector, and then use what's called a recurrent neural network, where rather than the data just going straight through to the output, it partially feeds back into earlier layers, so that the network has an idea of time. I'm getting out of my comfort zone here, but have a look at recurrent neural networks. They tend to use word embeddings to convert the words into vectors; the older way of doing it is to just turn them into a one-hot representation, so if you've got 2000 words in your vocabulary you have a vector of all zeros except a 1 for the particular word, but given the sparsity that often causes problems, which is why people use embeddings.

Q: And the last question, sorry: could you train a neural network to do maths, like addition, maybe multiplication, and if you can, would it maybe be faster than the usual way a processor does it?

A: You can train it to do addition; I think some people actually managed to take the MNIST dataset, put two digits in an image, have the network figure out what they are and train it to produce their sum, and that can work. Multiplication, I don't know whether it could figure that out; these models can learn certain things, which is interesting, and there are certain things they just don't do very well, so they're quite limited. As for whether it would be faster: no, using thousands of mathematical operations to do something that is a one-instruction operation on a processor is not going to be faster.
Q: Thanks, really interesting, great stuff around images. Any thoughts on how neural networks could be applied to text analytics? Most people don't seem to do that.

A: Text analytics isn't really my area, so I don't know, but I would speak to Katharine Jarmul, she's here and she gave a very good talk with a really good overview of what the text processing world is like; she covered quite a few models, not all of them neural networks, including some of the best models right now. It's outside my expertise, but she knows this stuff, so speak to her.

Q: The name neural networks suggests a similarity to the brain; is that analogy actually used in neuroscience, is the model similar to how the brain works?

A: I'm not sure, but I think the model that we use for the neural networks I've been talking about here is quite different from how neurons in the brain work. My very basic layman's understanding of brain neurons is that they operate on spike rates: they output spikes, and the frequency of those spikes is effectively the strength of the output, though I don't know for sure. So I don't think the two are that much alike. I think where the similarity lies is that people looked at how neurons in the brain connect to each other and asked, can we make models like this, and we've got something that seems to work well, given that brains produce very good pattern recognition. As for similarities to the brain beyond that, I couldn't say.

Q: Hi, have you heard of self-driving cars using deep learning to implement the driving? I wonder how they would define the cost function, because it's a stream of video rather than a fixed, static output.

A: I've heard about people doing it, and if you were to try something like that, one of the things you could do is prepare a bunch of footage where, say, the human driving the car has done well, hasn't crashed or killed anyone, and that's labelled as good, and maybe some footage of accidents, and that's bad, don't do that. What you'd probably want to do is say: given this video, produce these outputs, this steering, this acceleration, this braking, produce these decisions. That's actually a little bit like the Atari game-playing networks that Google DeepMind developed, which got really good scores in video games: they take the input screens and decide whether to move up, down, left, right or shoot; it would be a similar thing, except instead of deciding whether to move up, down, left or right, you control the steering, the accelerator and the brakes. You could do it like that, but given my experience, and given that, as I mentioned, rare examples of rare situations, like a kid in your neighbourhood running out into the street, might make up 0.0001% of your training set, it will never really be able to learn anything from those with that kind of cost function; it will discover local minima instead. So I would not be very comfortable getting into a car that was controlled purely by a neural network; I would not want to put my life in its hands, because it may not behave the way you'd want.
Q: Hi, do you ever combine neural networks with other techniques, like approximation algorithms, optimisation techniques, say for the travelling salesman problem, for example?

A: I don't know if anyone has even tried that with neural networks; I'm unaware of it and I'd be a bit surprised, but I'm afraid I just don't know. I guess you would have to figure out a way of constructing a cost function with some measure of the quality of the solution, but I don't know how you'd go about it.

Q: Maybe a technical question about the details, but when you apply dropout, does the Theano expression get recompiled and reoptimised to be more efficient, to take account of the dropped weights, or does it still do all the floating point operations on the GPU or CPU, just with the effect applied to the gradient?

A: I think it's the second, because what you do is use a random number generator to generate 0s and 1s and put that multiplication into the expression. I think it would be quite difficult to optimise away, because for each single sample in the mini-batch you're blocking out a different subset of the units, and I'm not even sure how you would go about optimising that efficiently: you'd have to select which units you're dropping and, from that, decide which operations you can skip, and doing that on the fly could be quite tough. I'm guessing, but I would guess it doesn't do it.
Q: No more questions, so that's probably time. Thank you again for the talk.
MP3
Computeranimation
Kernel <Informatik>
Zahlensystem
Faltungsoperator
p-Block
Bildauflösung
Ortsoperator
Softwareentwickler
Resonanz
Stapelverarbeitung
Zahlensystem
Computeranimation
Spezialrechner
Vorlesung/Konferenz
Information
URL
Computerarchitektur
Normalvektor
Stapelverarbeitung
Neuronales Netz
Bildgebendes Verfahren
Funktion <Mathematik>
Neuronales Netz
Größenordnung
Matrizenrechnung
Einfügungsdämpfung
Wellenpaket
Gewicht <Mathematik>
Güte der Anpassung
Stapelverarbeitung
Zahlenbereich
Ein-Ausgabe
Computeranimation
Arithmetisches Mittel
Multiplikation
Rechter Winkel
Konstante
Faltungsoperator
Vorlesung/Konferenz
Größenordnung
Stapelverarbeitung
Normalvektor
Neuronales Netz
Neuronales Netz
Standardabweichung
Softwaretest
Wellenpaket
Datenmodell
Systemaufruf
Ausgleichsrechnung
Zeitreise
Computeranimation
Chirurgie <Mathematik>
Informationsmodellierung
Softwaretest
Prognoseverfahren
Gruppe <Mathematik>
Stichprobenumfang
Algorithmische Lerntheorie
Neuronales Netz
Neuronales Netz
Normalvektor
Softwaretest
Wellenpaket
Multiplexbetrieb
Term
Teilbarkeit
Computeranimation
Teilmenge
Rhombus <Mathematik>
Multiplikation
Einheit <Mathematik>
Funktion <Mathematik>
Stichprobenumfang
Randomisierung
Größenordnung
Tropfen
Normalvektor
Funktion <Mathematik>
Bit
Gewicht <Mathematik>
Wellenpaket
Backpropagation-Algorithmus
Versionsverwaltung
Unrundheit
Überlagerung <Mathematik>
Quick-Sort
Computeranimation
Teilmenge
Erweiterte Realität <Informatik>
Einheit <Mathematik>
Menge
Stichprobenumfang
Tropfen
Gerade
Neuronales Netz
Wellenpaket
Transformation <Mathematik>
Transformation <Mathematik>
Computeranimation
Arithmetisches Mittel
Erweiterte Realität <Informatik>
Einheit <Mathematik>
Vorlesung/Konferenz
Neuronales Netz
Bildgebendes Verfahren
Varianz
Informationssystem
Fitnessfunktion
Neuronales Netz
Aggregatzustand
Standardabweichung
Distributionstheorie
Zentrische Streckung
Lineare Regression
Pixel
Green-Funktion
Ablöseblase
Computer
Ein-Ausgabe
Speicherbereichsnetzwerk
Raum-Zeit
Computeranimation
Arithmetisches Mittel
Spannweite <Stochastik>
Funktion <Mathematik>
Reverse Engineering
Rechter Winkel
Lineare Regression
Ein-Ausgabe
Stichprobenumfang
Server
Vorlesung/Konferenz
Neuronales Netz
Pixel
Bildgebendes Verfahren
Standardabweichung
Funktion <Mathematik>
Neuronales Netz
Lineares Funktional
Einfügungsdämpfung
Wellenpaket
Punkt
Weg <Topologie>
Bitrate
Teilbarkeit
Computeranimation
RFID
Portscanner
Vorlesung/Konferenz
Bitrate
Standardabweichung
Neuronales Netz
Fehlermeldung
Konstante
Extrempunkt
Konstante
Klasse <Mathematik>
Mustersprache
Vorlesung/Konferenz
Extrempunkt
Neuronales Netz
Computeranimation
Instantiierung
Resultante
Extrempunkt
Zahlenbereich
Element <Mathematik>
Extrempunkt
Computeranimation
Soft Computing
Kostenfunktion
Digitalisierer
Konstante
Vorlesung/Konferenz
Maschinelles Sehen
Maschinelles Sehen
Neuronales Netz
Expertensystem
Neuronales Netz
Distributionstheorie
Bit
Wellenpaket
Einfache Genauigkeit
Komplex <Algebra>
Computeranimation
Einfache Genauigkeit
Mehrrechnersystem
Flächentheorie
Mereologie
Mustersprache
Hypercube
Randomisierung
Vorlesung/Konferenz
Neuronales Netz
Figurierte Zahl
Bildgebendes Verfahren
Funktion <Mathematik>
Neuronales Netz
Netzwerktopologie
Wellenpaket
Quader
Mereologie
Zahlenbereich
Vorlesung/Konferenz
Bildgebendes Verfahren
Computeranimation
Entscheidungstheorie
Spezialrechner
Quader
Metropolitan area network
Videospiel
Punkt
Wellenpaket
Endogene Variable
Stellenring
Inverser Limes
Geräusch
Vorlesung/Konferenz
Neuronales Netz
Bildgebendes Verfahren
Computeranimation
Spezialrechner
Metropolitan area network
Orientierung <Mathematik>
Wellenpaket
Ortsoperator
Wellenpaket
Minimum
Vorlesung/Konferenz
Neuronales Netz
Bildgebendes Verfahren
Computeranimation
Schreib-Lese-Kopf
Gewicht <Mathematik>
Versionsverwaltung
Datenmodell
Systemaufruf
Wärmeübergang
Objektklasse
Computeranimation
Spezialrechner
Gewicht <Mathematik>
Wellenpaket
Vorlesung/Konferenz
Neuronales Netz
Bildgebendes Verfahren
Neuronales Netz
Attributierte Grammatik
Dualitätstheorie
Klasse <Mathematik>
Regulärer Ausdruck
Turing-Test
Extrempunkt
Computeranimation
Spezialrechner
Metropolitan area network
Informationsmodellierung
Last
Mailing-Liste
Vorlesung/Konferenz
Tropfen
Neuronales Netz
Bildgebendes Verfahren
Gammafunktion
Funktion <Mathematik>
Nichtlineares System
Demo <Programm>
Parametersystem
Datenmodell
Magnetooptischer Speicher
Vektorraum
Ein-Ausgabe
Variable
Menge
Faltungsoperator
Parametersystem
ATM
Computerarchitektur
Modelltheorie
Personal Area Network
Neuronales Netz
Resultante
Wellenpaket
Güte der Anpassung
Wärmeübergang
Ordinalzahl
Computeranimation
Spezialrechner
Dämpfung
Gewicht <Mathematik>
Vorlesung/Konferenz
Neuronales Netz
Bildgebendes Verfahren
Neuronales Netz
Inklusion <Mathematik>
Bit
Subtraktion
Mereologie
Gewicht <Mathematik>
Kategorie <Mathematik>
Klasse <Mathematik>
Stellenring
Computeranimation
Portscanner
Objekt <Kategorie>
Task
Spezialrechner
Metropolitan area network
Informationsmodellierung
Task
Gewicht <Mathematik>
Mereologie
Vorlesung/Konferenz
URL
Neuronales Netz
Bildgebendes Verfahren
Dimension 2
Resultante
Parametersystem
Wellenpaket
Güte der Anpassung
Wärmeübergang
Extrempunkt
MP3
Computeranimation
Eins
Quader
Randwert
Domain-Name
Font
Wellenpaket
Parametersystem
Wärmeübergang
Vorlesung/Konferenz
Abstand
Neuronales Netz
Neuronales Netz
Geräusch
Computeranimation
Kreisbogen
Endogene Variable
Demoszene <Programmierung>
Spezialrechner
Uniforme Struktur
Datenfeld
Minimalgrad
Bereichsschätzung
Endogene Variable
Faltungsoperator
Vorlesung/Konferenz
Neuronales Netz
Bildgebendes Verfahren
Neuronales Netz
Instantiierung
Metropolitan area network
Parametersystem
Orientierung <Mathematik>
Reverse Engineering
Datenmodell
Computeranimation
Digitale Photographie
RFID
Arithmetisches Mittel
Objekt <Kategorie>
Gefangenendilemma
Textur-Mapping
Spezialrechner
Iteration
Gewicht <Mathematik>
Funktion <Mathematik>
Reverse Engineering
Ein-Ausgabe
Parametersystem
Vorlesung/Konferenz
Wort <Informatik>
Datenstruktur
Neuronales Netz
Bildgebendes Verfahren
Neuronales Netz
Inklusion <Mathematik>
Spezialrechner
Bit
Reelle Zahl
Vorlesung/Konferenz
Wort <Informatik>
Bildgebendes Verfahren
Computeranimation
Neuronales Netz
Touchscreen
Resultante
Sichtbarkeitsverfahren
Parametersystem
Matching <Graphentheorie>
Punktspektrum
Computeranimation
Spezialrechner
Metropolitan area network
Arithmetischer Ausdruck
Informationsmodellierung
Vorlesung/Konferenz
Wort <Informatik>
Faltung <Mathematik>
Grundraum
Touchscreen
Neuronales Netz
Prozess <Physik>
Wellenpaket
Thumbnail
Zahlenbereich
Schlussregel
Ausnahmebehandlung
p-Block
Ein-Ausgabe
Formale Semantik
Mustersprache
Faltungsoperator
Computerarchitektur
Information
Bildgebendes Verfahren
Zeichenkette
Neuronales Netz
Addition
Topologische Einbettung
Prozess <Physik>
Selbstrepräsentation
Gruppenkeim
Vektorraum
Schwach besetzte Matrix
Term
Zeitzone
Quick-Sort
Differenzengleichung
Wort <Informatik>
Schlüsselverwaltung
Figurierte Zahl
Funktion <Mathematik>
Neuronales Netz
Addition
Nichtlinearer Operator
Informationsmodellierung
Multiplikation
Einheit <Mathematik>
Digitalisierer
Analytische Menge
Figurierte Zahl
Bildgebendes Verfahren
Informationssystem
Neuronales Netz
Ebene
Informationsmodellierung
Prozess <Physik>
Flächeninhalt
Güte der Anpassung
t-Test
Dualitätstheorie
Quick-Sort
Neuronales Netz
Hydrostatik
Streaming <Kommunikationstechnik>
Informationsmodellierung
Prozess <Physik>
Kostenfunktion
Güte der Anpassung
Ähnlichkeitsgeometrie
Physikalisches System
Mustererkennung
Frequenz
Neuronales Netz
Funktion <Mathematik>
Nachbarschaft <Mathematik>
Videospiel
Approximationsgüte
Bit
Approximation
Schießverfahren
Wellenpaket
Extrempunkt
Minimierung
Güte der Anpassung
Systemaufruf
Mailing-Liste
Ein-Ausgabe
Videokonferenz
Entscheidungstheorie
Travelling-salesman-Problem
Computerspiel
Menge
Spieltheorie
Kostenfunktion
Touchscreen
Neuronales Netz
Soundverarbeitung
Nichtlinearer Operator
Punkt
Gewicht <Mathematik>
Compiler
Symboltabelle
Automatische Differentiation
Zentraleinheit
Zufallsgenerator
Gradient
Arithmetischer Ausdruck
Reelle Zahl
Kostenfunktion
Einflussgröße
Teilmenge
Nichtlinearer Operator
Arithmetischer Ausdruck
Subtraktion
Einheit <Mathematik>
Trennschärfe <Statistik>
Stichprobenumfang
Element <Mathematik>
Metropolitan area network
Computeranimation

Metadata

Formal Metadata

Title An Introduction to Deep Learning
Series Title EuroPython 2016
Part 166
Number of Parts 169
Author French, Geoff
License CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You may use, modify, copy, distribute, and make the work or its content publicly available in unmodified or modified form for any legal, non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and pass on the work or content, including in modified form, only under the terms of this license.
DOI 10.5446/21145
Publisher EuroPython
Release Year 2016
Language English

Content Metadata

Subject Area Computer Science
Abstract Geoff French - An Introduction to Deep Learning. Deep learning: how it works, how to train a deep neural network, the theory behind deep learning, recent developments and applications. ----- (length: 60 mins) In the last few years, deep neural networks have been used to generate state-of-the-art results in image classification, segmentation and object detection. They have also been used successfully for speech recognition and textual analysis. In this talk, I will give an introduction to deep neural networks. I will cover how they work, how they are trained, and a little bit on how to get going. I will briefly discuss some of the recent exciting and amusing applications of deep learning. The talk will primarily focus on image processing. If you are completely new to deep learning, please attend T. Rashid's talk 'A Gentle Introduction to Neural Networks (with Python)'. His talk is in the same room immediately before mine; his material is really good and will give you a good grounding in what I will present to you.
