# A Gentle Introduction to Neural Networks (with Python)

Speech transcript

00:01

Good morning everyone. So this is the last day of EuroPython, and in this session we

00:06

are actually dealing with some very hot topics that I'm really interested in: neural networks and deep learning. To begin with we have our first speaker, [name unclear]. He is giving a talk on a gentle introduction to neural networks, so please give him a round of applause.

00:30

Can you hear me? [inaudible]

00:32

[inaudible] Thank you very much for coming to my talk. For me it's really great to be at an open, community, open-source conference. I always learn a lot, I always have great conversations, and there's always a very generous spirit, so thank you everyone, and also the organizers. So, nice one. I'm going to talk about neural networks, and I just want to be really clear: my talk is intended to be very introductory. It's for people who perhaps don't know what neural

01:06

networks are or how they work, or maybe you studied them a long time ago and have forgotten. So if you already know what they are, or you're an expert, you might be bored, so I don't mind if

01:16

you want to go to another talk. My name's [name unclear] and I am one of the co-organizers of London Python, and if you want to come and do something with us, please come along and have a chat with me. We really want to do more. We do broad things, everything from computer art to

01:36

teaching people to code. Right, cool. So, as I said, this is an

01:55

introductory talk, and we'll talk a little bit about the background: what is artificial intelligence, and why is there a lot of interest in neural networks at the moment. And then we'll get into

02:04

the ideas, and that's the meat of this talk really: what are the concepts that are used in neural networks? What I'll do is use some very simple

02:16

examples, which may not seem that interesting, but they illustrate the very key points that make neural networks work, so I hope that you stick with them

02:26

and that will help us understand what's going on inside a neural network. And

02:31

we'll also apply them: I'll give you an example of applying neural networks to quite an interesting challenge, recognizing handwritten numbers, and I'll give some pointers on how you might code it. And, I might regret this, there's a kind of live demo at the end, and if it goes wrong

02:51

that's going to be really embarrassing, but we'll give it a go. I'm not going to talk about libraries. There's lots of cool stuff out there, [library names unclear], and there are a couple of talks today covering things like that. This one is really about the concepts: what's really going on and how you might do it yourself. Right, so

03:13

just to get us into the right frame of mind, I'm going to start with two questions. So I have a

03:19

seven-year-old daughter, and she likes challenges, so I set her this challenge. I said, can you look at this picture and point out where the people are? As a seven-year-old child she found that quite exciting and very easy, like any seven-year-old child, and she can point at the people in the picture, and that's fine. And she can do numbers, she can actually subtract, so I said, can you add

03:43

those numbers? And then she

03:45

found that very difficult. But, you know, with coding, with computers, with Python, doing calculations such as the one on the right is actually very easy, but instructing the computer to find people in the photo is not so easy. So that's

04:03

interesting: that's easy for us, and that's easy to code for computers, but that's hard to get computers to do, and

04:12

that's easy. So there's something there, and we would like to be able to solve these kinds of interesting problems: find me a picture of a cat, or work out what the words are in this audio file. Those are really interesting problems and we want to solve more of them. And terminology like artificial intelligence means different things to different people, so for me it means being able to solve the kinds of problems that traditionally have not been found

04:42

that straightforward. So that's what this is about. And there's hype at the moment because there's lots

04:49

of stuff going on that you won't want to miss: there's autonomous cars, there's health data being used to improve outcomes, and Google's been very active recently with them being able to play

05:03

Go, which is amazing. We thought that it would take another 20 years or so, and they used neural networks as part of their solution. So that's got people interested,

05:14

and that's what we'll talk about. So

05:18

let's go right back to the beginning, really, really assuming nothing. We want to ask the computer a question and we want an answer, and it has to do some kind

05:30

of thinking. But clearly it can't think, it's just metal and wires, and so it has to

05:36

calculate, it has to process, and those words I guess programmers like ourselves understand. We have inputs, we have some kind of calculation, and we have an output. And neural networks, artificial intelligence, that's all it is. There's nothing mysterious about it, it's just

05:53

calculations that can be done. So let's set ourselves

05:59

a very, very simple example just to get started. Imagine that the conversion from kilometers to miles is a difficult problem. Just imagine; I know it's not. And imagine we didn't know how to do it, so we invent a model in our mind. We say maybe one is the other

06:19

one multiplied by a number. That's a model we can come up with, a model we think might be right. It might be wrong, but let's try it. Let's start with the number; we don't know what it

06:30

could be. Miles is kilometers times some number, so let's take 100 kilometers, and the number we'll start with is 0.5. If we compare it with real examples of truth, we know it should be 62.137, but our model calculated 50. It's not that bad, but there is an error of about 12.

06:56

Let's try 0.6, which gives a better answer,

07:02

still not exactly right, but the error is much smaller now. Let's try again: put in 0.7. Oh, we've gone too far, and it's much worse now. So that was a bit too enthusiastic a jump. Try 0.61, and that's actually getting quite close. So

07:22

this idea of using a model, tweaking a parameter inside it, and then comparing the output with what we know should be true is how neural networks work, and many other machine learning methods. We use the error that pops out the other end to guide the refinement of the parameters of the model. Just to be clear, that's a super easy example, but that's what a neural network is doing. If you just replace that circle with a neural network, that's really what's happening: you're training it by looking at the error, and you're tweaking parameters inside it to

08:01

try and get a better answer. And bingo.
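The guess-and-check loop from the kilometers-to-miles example can be sketched in a few lines of Python. This is only an illustration of the idea: the function name `error_for` and the dictionary `results` are made up for this sketch, while the 100 km input, the true value 62.137 and the four guesses come from the talk.

```python
# Guess-and-check for the model: miles = kilometers * c
km = 100.0
true_miles = 62.137  # the known correct answer quoted in the talk


def error_for(c):
    """Return (prediction, error) for a candidate multiplier c."""
    predicted = km * c
    return predicted, true_miles - predicted


# The four guesses tried in the talk, in order
guesses = [0.5, 0.6, 0.7, 0.61]
results = {c: error_for(c) for c in guesses}

for c, (predicted, error) in results.items():
    print(f"c = {c:<4}  predicted = {predicted:6.2f}  error = {error:+.3f}")
```

The sign of the error says which way to move the parameter (0.7 overshoots, so the error turns negative), and its size says roughly how far.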

08:09

Again, so the key points there: if we don't know how something really works, if we have not got an exact mathematical model, we can invent one. We can come up with a model that we think might be true, we can try it, and we can have parameters that we can adjust. And the important point there is that the error is used to refine the model. So let's take our

08:33

daughter to the garden, where she likes to

08:34

pick up bugs, and she's picked up some caterpillars and ladybirds. Imagine that we've plotted them on a graph with width and length. Caterpillars are thin and long and ladybirds are short and wide, and if you plot them we can see there are two clusters, two groups there, which is interesting. Some of you

08:57

will recognize this as clustering, but that's cool.

09:03

What we did with our first example was have a linear predictor: the relationship between kilometers and miles, we thought, was a straight line, and we changed the parameter, we changed the slope. What we're trying to do here is to see if

09:19

we can apply the same simple model and see if we can come up with a way of predicting, or classifying, what a bug should be. So that line, instead of

09:30

being a prediction line, could be a separating line, so things on one side of the line are one type of object, caterpillars, and things on the other side of the line might be ladybirds. That's not what we want, because that doesn't separate the two kinds of bug. This

09:48

one doesn't either. And this one does. So

09:52

you can see that learning to classify is not that different from the very first simple example we looked at, and you can also see through this kind of naive animation that we're changing the slope as a way of learning to find a good separation line between the two clusters. So once we've learned that line, if we've learned a good separating line, if we then

10:20

find an unknown bug, you can say, well, it falls in that half of the space, so it must be a caterpillar. So
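Once a separating line has been learned, classifying an unknown bug is a single comparison. The sketch below assumes the line passes through the origin, so its slope is the only parameter, as in the talk. The function name `classify`, the slope value and the sample measurements are hypothetical and chosen only for illustration.

```python
def classify(width, length, slope):
    """Classify a bug by which side of the line length = slope * width it falls on.

    Points above the line (long and thin) are treated as caterpillars,
    points below it (short and wide) as ladybirds.
    """
    return "caterpillar" if length > slope * width else "ladybird"


slope = 1.0  # hypothetical learned slope, not a value from the talk

print(classify(width=1.0, length=3.0, slope=slope))  # thin and long
print(classify(width=3.0, length=1.0, slope=slope))  # short and wide
```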

10:31

classifying things is kind of like predicting things. We apply these methods when we don't really know what the model should be, but what we do have is real data. So we learn from data: we invent a model we think is a good one and we try to refine it to match the data that we've collected. It might be data from space, the microwave background radiation, which we heard about earlier in the week; it might be voice data; it might be sentiment. We're going to stick

11:06

with a super, super simple dataset consisting of two items, just the width and length of two bugs. Here we've

11:17

plotted them. So we start again with a randomly chosen parameter for that line, a randomly chosen gradient, and we say, OK, that's not good because it doesn't separate the two bugs. So let's look at the first example there: we need to shift the line up to that point. OK, we've improved it, but it's not quite doing a good job yet. What can we learn from the second? Well, when we look at the second example we say, all right, the separator must keep that example on that side of the line, and that kind of works.

If you're interested in the maths, it's really simple. We've got straight lines, so it's very simple linear algebra: you can rearrange the terms to work out what the change in gradient should be if you want to get the line to go through a certain point.

But actually we've made a kind of mistake here. What we've done is we've looked at an example and ignored all the previous ones. We don't want to do that; we want to learn from all the data, not just the last one we looked at. If we did this, we would work through all the examples and just have an answer which you could have got by looking at the last example. So one way of avoiding that is not to be so enthusiastic about the error and the amount that you jump by. What you can do is say, instead of

12:53

jumping all the way when changing the line, we apply a factor, a learning rate, so we only jump a little bit. So if the first example wants me to go over there, I just move in that direction a little bit. If the next example wants me to go over there, I move a little bit. So with lots of data you can see

13:13

that you eventually get better and better. We're not overly influenced by each individual data point, and that's good because the data is noisy; there can be errors in the data. You don't want to give too much importance to any one individual data point. And a learning rate is quite

13:31

an important idea in neural networks; we'll understand why in a minute.

So let's increase the complexity a bit. Imagine we have a dataset which really has something to do with the real world, and it's causal. So maybe I'm measuring the amount of smiling in this room, and the other factors I'm measuring are whether it's sunny and whether it's the weekend. You can see that if it's sunny and it's the weekend, there might be more smiles. Or, if the sun is shining and there are no clouds, the temperature might go up. You can see that in the real physical world data can have causal links, and that would be a great thing to model, to predict or classify data that comes from the real world.

So here are some simple examples. We have the Boolean relations: we have the OR relation and the AND relation, where the AND is only true if both inputs are true. Some data can be like this, so can we model that with the very simple classifier that we've got? Just to remind you of the thing we did at the start, with the picture: there is nothing wrong with having two inputs into a calculation.
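Returning to the learning rate introduced a moment ago, the moderated update can be sketched as follows. The talk only says that the change in gradient follows from simple linear algebra; the update rule `error / x` below is one way to get a line through the origin to pass through a chosen point, and is an assumption of this sketch. The function name `train_slope`, the starting slope and the two target points are also illustrative, not taken from the talk.

```python
def train_slope(examples, learning_rate, slope=0.25):
    """Nudge the slope of the line y = slope * x towards each example in turn."""
    for x, target in examples:
        error = target - slope * x          # how far the line is from the target at x
        full_jump = error / x               # change that would put the line exactly on the target
        slope += learning_rate * full_jump  # only take a fraction of that jump
    return slope


# (width, target length for the dividing line): illustrative values only
examples = [(3.0, 1.1), (1.0, 2.9)]

unmoderated = train_slope(examples, learning_rate=1.0)
moderated = train_slope(examples, learning_rate=0.5)
print(f"learning rate 1.0 -> slope {unmoderated:.4f}")
print(f"learning rate 0.5 -> slope {moderated:.4f}")
```

With a learning rate of 1.0 the final slope is decided entirely by the last example, which is the mistake described above. With 0.5 both examples leave their mark on the result.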

14:57

So that's OK. You can visualize the data by

15:02

saying we can plot the two inputs as coordinates and show the output as a color. So if we have an AND relationship, we color it green if they're both on, and yes, the dividing line

15:17

still works: we can still have a linear classifier to separate data which has an AND kind of causal link in there. So that's cool: we could use that very simple, very

15:33

naive classifier to learn data which has the AND, and it works for the logical OR as well. So that's cool, that's looking hopeful. But actually, in

15:47

history, I think it was probably in the seventies, people became sad because somebody wrote a paper that said these simple classifiers are actually very limited. That's because these simple linear classifiers can't learn data which has the XOR relationship. So if I have two variables which are related to the answer with an XOR, which is only true if either of the inputs is true but not both, it can't do that. And you can see why; you can see visually that no single line correctly separates those two classes. So that led to, I guess, a bit of a slowdown in research in neural networks.

But if you look at this, you think, well, the answer is kind of there already: we have two lines. This is an important point. This is a very simple example, but what it suggests to us is that we need more than one of those classifiers to help us with data that's more complex, and that is actually why neural networks have many, many nodes, more than just one node. So some problems can't be solved with just a simple linear classifier, and that's the motivation for why you might want to explore using multiple nodes.

Let's shift a little bit and look at nature again. We started right at the start with the example of my daughter's brain being able to find people in a photograph, but me not being able to code that very easily. So the human brain is doing something, and working in a way, that's different from how my laptop here works. And right through history people have tried to understand what is it

17:53

about the way biological brains work that makes them so good, and what we can learn from nature and replicate in new kinds of algorithms. And actually, just have a look: this computer's got, what is it, 16 gigabytes of RAM, and however many giga-instructions per second; it's quite chunky. And yet a pigeon, which has a brain of 0.4 grams, can fly, it can

18:22

learn to eat, it can communicate, and it can learn to do new tasks. That's really important. And a snail has about 11,000 neurons. That's not very many; with big data we can store huge amounts in data structures, and these things have just 11,000. And this worm has 302 neurons. [unclear] And this is interesting: there is a species of pilot whale which has 37 billion neurons, but we humans have about 20, and it's using them, because if it

19:06

wasn't using them, it would have evolved them away because of the cost. That makes me think that maybe we're not the most superior things on the planet. But anyway, the point here is that nature is doing something with brains that we can learn from, and with apparently such small resources they're able to do tasks which we think of as quite complicated. So those neurons that the

19:36

biologists know are inside our brains and nervous systems: if we look at them, what they do is transmit a signal along and on to another one, and there are names for the various elements. But what they

19:52

don't do is just pass a signal on without any kind of resistance. What they do is only pass a signal on once the signal has passed a threshold. It's like turning up a dial, and the light only goes on after you've reached a certain number. So maybe the computing neurons that we model should do the same. Some people think you could use a step function to do that, so if the input is past a certain point, it switches on. And you could do that, but in nature we know that things aren't always black and white and hard-edged; things are softer. So we might try a softer kind of function, the sigmoid function, and there are others that you could use.

And we know that in nature these things are connected like a network, a mesh, with signals going along, so maybe that's what we should try and model when we want to do some interesting tasks like recognizing pictures. Again, going back to the thing we saw right at the start, there's nothing wrong with having more than one input coming into a computing node. What we've just said here is that we're collecting the inputs, just as they do in nature, and then applying a threshold function, so that we only have an output if the combination is big enough. And that becomes our node in a neural network.

So, after twenty minutes of talking, and I'm running out of time, that's all it is: a neural network, an artificial neural network, is our attempt to try and recreate what biological brains are doing. Each of those circles is doing what we saw here: collecting the signals, applying a threshold function, and passing on the output. And

22:01

it is convention that we call these layers: we have the input layer, the middle layer, and the output layer.
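A single node as described above, collecting its inputs and then applying a threshold function, might look like the following sketch. It compares the hard-edged step function with the softer sigmoid. The function names, the default threshold of zero and the sample inputs are assumptions for illustration.

```python
import math


def step(x, threshold=0.0):
    """Hard-edged activation: fully on once the input passes the threshold."""
    return 1.0 if x > threshold else 0.0


def sigmoid(x):
    """Softer S-shaped activation, 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def node_output(inputs, activation):
    """Collect (sum) the incoming signals, then apply the activation function."""
    return activation(sum(inputs))


signals = [0.2, 0.3]  # hypothetical incoming signals
print(node_output(signals, step))
print(node_output(signals, sigmoid))
```

The step function jumps straight from 0 to 1, whereas the sigmoid rises smoothly, which matters later because the learning method relies on slopes.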

22:06

And then there are some connections. So let's

22:13

pause a little bit and think about our very, very first example, where we wanted to convert miles to kilometers. We had a straight line with an adjustable slope parameter. That's what did the learning: the learning was the changing of that slope, that multiplication factor. What does the learning in a neural network? What do we change, what do we need to tweak so that the output is better? There are probably lots of answers to that. You might say that threshold function, maybe we need to change the slope of the sigmoid in each of those nodes. That's probably not a bad idea. But actually what history has taken us to, what people do, is adjust the strength of the links between those nodes. So if a link is strong, a signal is amplified; if a link is weak, it is reduced; and if the weight, the strength, is zero, it's effectively a broken link. So that's one approach, and that's what has become popular, because

23:21

it's easier as well. So when we

23:28

feed signals forward, let's just imagine a signal of 1 at the top there, and we have a link there, call it the link between 1 and 1, and you can see it's got a weight, a strength, of 0.9. What we would do is take the signal 1, multiply it by 0.9, and that's what we're feeding the next node. The same here: 0.5 times 0.3 is what would go there. That's really easy, that's not complicated at all, and that's what is happening inside a neural network: just multiplying signals through connections and feeding them on to the next node and collecting them. That's

24:08

just a reminder of what we're doing: we collect the signals coming in and we add them up, but this time

24:13

you can see that we're weighting them: we're using the weights of those links to either boost or reduce the signals. I've colored them there so you can see them, and you can see the sigmoid applied after the calculation there. If you want to, you can verify that it is that times that plus that times that, and that would give you that answer. And that really is as simple as it gets. That's all a neural network is doing.
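The weighted sum described above can be checked by hand or in a few lines of Python. The signals 1.0 and 0.5 and the weights 0.9 and 0.3 are the ones quoted in the talk; the variable names and the choice of sigmoid as the activation are assumptions of this sketch.

```python
import math


def sigmoid(x):
    """Logistic activation, 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


inputs = [1.0, 0.5]   # signals arriving from the previous layer
weights = [0.9, 0.3]  # strengths of the links into this node

# Each signal is scaled by its link weight, then everything is added up
combined = sum(signal * weight for signal, weight in zip(inputs, weights))
output = sigmoid(combined)

print(f"combined input = {combined:.2f}")
print(f"node output    = {output:.4f}")
```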

24:41

Nothing very complicated at all. So we had a very

24:48

simple network here, with just four nodes. If we write out with pen and paper what is happening at each node, so at node number 1 in layer 2, if we write out what's actually

25:02

happening, we'd say it's input 1 times that weight plus input 2 times that weight. If we write it out for this one, and we write it out again for all the nodes, you start

25:12

to see a pattern, and that pattern is really helpful because it allows us to write that calculation as a matrix multiplication. So the weights matrix times the input signals becomes the signals

25:29

going to the next layer. And that's really, really valuable to us for two reasons. It allows us to

25:40

write that calculation in a much more concise way, so we don't have to write pages and pages for big networks; we just write weights times input is the signal into the next layer. The other reason that's really important is because computers can accelerate matrix multiplications, and we want to take advantage of that, whether it's NumPy, whether it's the Fortran libraries we heard about earlier, or whether it's hardware acceleration, using your graphics card to multiply matrices. If we can formulate our calculations in terms of matrices,

26:16

then we can take advantage of the acceleration that's possible. So you might say, oh, matrices are so boring, why do we have to do this again? But this is the reason why. So that's cool: we're feeding a signal forward through each layer of the network and we get an answer at the other end. We know that we're likely to be wrong, just like we were at the start, so we have an error, and going back to the very first example again, we use that error to

26:52

refine and improve the parameters inside the model. How do we do that here? So let's break this down,

27:00

and let's have a simple network, a simple picture, just to see what might happen. Well, we know

27:06

that we need to change the weights. We've already agreed that we can change the strength of those links in order to try and improve the answer; that's what we're playing with, those are the parameters we're tuning. And we know what the error is: we know the error right at the end of the network. If the answer should be 5 and we get 3, the error is 2. But what's the error here, inside the network? Because we need to know the error in order to change the weights. So that's quite an interesting question, and actually lots of the guides and books gloss over that a little bit. The first thing to say is that there's probably no mathematically perfect answer, so what we do is we can

27:54

think, and the word is heuristics: we think, well, what would be an intuitive way of splitting the error inside the network? One intuitive idea is to say, let's say the error is 5: maybe I push two and a half this way and two and a half that way. That's an idea. Another idea is to say that the top link, where the weight is 3, contributed more to the error because it is a bigger, stronger link; it magnified the signal, so maybe I should put more error in that direction. So you split the error in proportion to the link weights. So if we've got

28:37

weights of 3 and 1, the links of strength 3 and 1, you can see three quarters of the error would go to the top node and a quarter would go this way. That kind of makes sense, and I'm sure there are more sophisticated things you can do, but we want to keep it simple, especially if it works. So you can see there that the error from that node is being split and pushed back, and the same here, and at the internal nodes you collect the several fractions of error that are linked to it. It sounds complicated, but when you see it as a picture you can see the errors flowing backwards. Back-propagation of error, that's where the term comes from, error backpropagation: feeding signals forward and back-propagating errors. You can do the calculations afterwards; the point here is that you're summing up the errors. So if the error is 0.6 from that top right one and this one was contributing 0.1, the error there is 0.7, 0.6 plus 0.1. You've just collected the errors.

And again it's really nice, really fortunate, that if we did write out what was really happening in terms of the variables, it becomes a matrix multiplication again, which is really nice because we can accelerate that and we can write it in a very concise way without worrying about the actual size of the network. It's only slightly different because the weights matrix

30:11

is then transposed, flipped along its diagonal. Again, super, super

30:16

simple. OK, so we've got the

30:22

errors now at each node. How do we actually change the weights? OK, so that's the output at one of the output nodes, and those w's are the weights of all those links inside. I'm not able to untangle that; if you can, well done. That's horrible. So what we need to do here is to say, we're not going to be able to untangle that in any kind of nice, mathematically clean way, so let's find other mathematical methods which are perhaps approximate but good enough. So let's go on a bit of a journey.
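The two matrix forms described in the preceding segments, weights times inputs going forward and the transposed weights times errors going backward, can be sketched with NumPy. The 2-by-2 weight matrix, the output errors and the helper `split_error` are hypothetical values and names chosen for this sketch; only the 3-to-1 split of an error of 5 mirrors numbers mentioned in the talk.

```python
import numpy as np

# W[j, i] is the strength of the link from node i in one layer
# to node j in the next layer (illustrative values)
W = np.array([[0.9, 0.3],
              [0.2, 0.8]])

inputs = np.array([1.0, 0.5])

# Forward: signals into the next layer are the weights matrix times the inputs
signals_into_next_layer = W @ inputs

# Backward: errors are pushed back through the same links using the transpose
output_errors = np.array([0.8, 0.5])  # hypothetical errors at the next layer
errors_previous_layer = W.T @ output_errors


def split_error(error, link_weights):
    """Share one node's error among its incoming links in proportion to their weights."""
    link_weights = np.asarray(link_weights, dtype=float)
    return error * link_weights / link_weights.sum()


print(signals_into_next_layer)
print(errors_previous_layer)
print(split_error(5.0, [3.0, 1.0]))
```

Note that `W.T @ output_errors` skips the division by the sum of the weights that `split_error` performs. As far as this sketch assumes, that is the simplification that turns the proportional split into a plain matrix product; the shares keep their relative sizes per node but are scaled differently.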

31:04

Imagine this landscape is a complicated function like the one we saw, and it's an error function, because what we had here is the output, and the error function is

31:18

simply that minus what the

31:21

actual target should be. And if this horrible, complicated, lumpy landscape is a very complicated function which we can't

31:32

work out analytically with nice clean algebra, another way to work with it, and maybe work out the minimum, the minimum error, is to say, well, if this was a landscape and I didn't

31:47

have a map of everything, and it was dark and I couldn't understand the whole function, but I did have a torch, what I could do is point the torch down at my feet and say, well, the slope is going in this direction, take a few

32:02

steps; it's going in that direction, take a few steps; and eventually you would work your way down to a minimum. Some of you will put your hands up and say it might not be the best minimum; we'll come to that. So this approach is not mathematically clean, it's an approximate method, but it works really well. And you can see it working really well: let's pretend the x squared function is really

32:30

difficult maths. Let's pretend. You can see that we start at a point, we see where the gradient is locally, and we move in that direction, and if we keep doing that we get to the minimum, which works nicely. And you might even be more sophisticated and say, as the slope gets smaller you might take smaller steps, because you're getting closer and closer to the real minimum and you don't want to overstep it. That's an idea that's actually used in neural networks as well. So if that horribly complicated function was the error function, then we

33:09

have a way of finding its minimum. I have a picture to show you there.
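The torch-in-the-dark idea can be tried on the x squared function mentioned above. In this sketch the function name `gradient_descent`, the starting point, the step size and the number of steps are all arbitrary illustrative choices. The gradient of x squared is 2x.

```python
def gradient_descent(gradient, start, step_size, steps):
    """Repeatedly step downhill, in the direction opposite to the local slope."""
    x = start
    path = [x]
    for _ in range(steps):
        x -= step_size * gradient(x)  # steeper slope means a bigger step
        path.append(x)
    return x, path


# Pretend y = x**2 is too hard to minimise with algebra; its slope is 2 * x
minimum, path = gradient_descent(lambda x: 2 * x, start=5.0, step_size=0.1, steps=100)

print(f"first steps: {[round(p, 3) for p in path[:4]]}")
print(f"estimate of the minimum: {minimum:.2e}")
```

Because each step is proportional to the local slope, the steps shrink on their own as the minimum gets closer, which is the "smaller steps near the bottom" refinement mentioned in the talk.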

33:16

So if we have the weights, which is what we want to improve, and we have an error function which is complicated, we want to use this gradient descent method to find the minimum of that error, and we'll then know what the right weights should be. So if we're over here with the wrong weight, the error is higher, and we say, OK, I want to improve my position and move down the error function to somewhere where the error is smaller, and then that will tell you what the right weights are. This is called gradient descent, and it's a way of working with that horrible kind of expression that we couldn't

33:58

untangle analytically before. If you did write this out again with

34:04

pen and paper and worked out the gradient locally, it's not that hard. [unclear] It's very simple calculus; it's the kind of calculus you do at school, just using the chain rule and nothing more complicated than that. So if you are interested in having a look, there is what I hope is a very clear blog post on that. So what we're doing now is we've worked out a way of improving the weights based on the gradient of the error function, and here you see this kind of thing many times, where we iterate and keep improving. OK, I've only

34:46

got a bit of time left, so I'm going to zoom through how you might do this yourself. I'm not an expert Python coder, unlike some of the people here, but broadly speaking, if you wanted to do this

35:00

yourself, you might think, what would a Python class look like? Well, we know we've got to initialize this data structure, this network. That is really simple: all we really need to do is set the size

35:14

and initialize the weights to random initial values. We know that we're going to have some way of training the network, so doing the learning, and we've got to have a method of querying the network, so we ask it questions and get an answer back. If you want to make your own neural network library or class, there's nothing more complicated than this at all. [unclear]

35:42

[unclear] And there are very useful Python libraries: NumPy is great for matrix multiplications, and SciPy as well, which you always reach for, because

35:54

there are some nice functions in there for doing that S-shaped curve; the sigmoid function, as it's called, is built in, so you can use that yourself. [unclear] I started programming a long time ago and then I stopped, so I was

36:13

coding Python in 1999, I think, or '98, Python 1.5 or 1.6, with the Numeric library rather than NumPy; NumPy didn't exist then. So I came back to Python and NumPy is fantastic, excellent. So this is an example of

36:31

a function which initializes the network. It looks complicated, but don't be put off; all it's really doing is setting the size in terms of input nodes, hidden nodes and output nodes, and using a numpy function to randomize the weights, which are a matrix. That's it, nothing more complicated than that. The scipy function there, expit, that's the logistic function, the S-shaped curve. Before the training, there's the querying, again really easy. We take the inputs, which are turned into an array (it was a list), and here we can see that there is a matrix multiplication employed to do the calculation. And that's it: we apply the activation function to the output of that multiplication, simple as that. You then take the signal to the next layer and do it again. That's

37:29

it. It's the simplest form of propagating a signal through a neural network.
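The initialization and query steps just described can be sketched with numpy as follows. The class and attribute names, the normal-distribution scaling of the initial weights and the `seed` parameter are assumptions for illustration, not the speaker's slide code. The logistic function is written out by hand here; `scipy.special.expit` computes the same curve.

```python
import numpy as np


def sigmoid(x):
    # Logistic activation function, the S-shaped curve.
    return 1.0 / (1.0 + np.exp(-x))


class NeuralNetwork:
    def __init__(self, input_nodes, hidden_nodes, output_nodes, seed=None):
        rng = np.random.default_rng(seed)
        # Weight matrices: rows index the next layer, columns the previous one.
        self.wih = rng.normal(0.0, hidden_nodes ** -0.5,
                              (hidden_nodes, input_nodes))
        self.who = rng.normal(0.0, output_nodes ** -0.5,
                              (output_nodes, hidden_nodes))

    def query(self, inputs_list):
        # Turn the input list into a column vector.
        inputs = np.array(inputs_list, dtype=float).reshape(-1, 1)
        # Matrix multiplication, then the activation function, layer by layer.
        hidden_outputs = sigmoid(self.wih @ inputs)
        final_outputs = sigmoid(self.who @ hidden_outputs)
        return final_outputs


net = NeuralNetwork(input_nodes=3, hidden_nodes=4, output_nodes=2, seed=0)
outputs = net.query([0.9, 0.1, 0.5])
print(outputs.shape)
```

Each layer is one matrix product followed by one element-wise function call, which is why the whole forward pass fits in two lines.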

37:34

As simple as that. I'm sure it can be made even more concise, but I just want to let you know by doing this that it's not mysterious or scary or complicated; it really is as simple as that. And the training, again, might look scary, but it isn't really. The top half is exactly the same as what we just had, propagating the signals forward, exactly the same code as before. Then what we're doing is saying that the output error is the target, which is from the training data, minus what we've worked out, and then we use another set of matrix multiplications to work out the errors in the internal layers of the network, and then we change the weights

38:21

using that expression that we worked out with calculus. That's it, that's how you train the network. I'm sure it can be made even more beautiful and clean, but I just wanted you to get a feel for what's really going on in the network, and it's not that complicated in itself.
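A hedged sketch of one training step as described: forward pass, output error as target minus actual, errors passed back through the weights, then a weight update. The function name `train_step`, the learning rate and the toy sizes are assumptions. Passing the raw output errors back through the transposed weights is a simplification assumed here, not necessarily what the slide showed.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_step(wih, who, inputs_list, targets_list, learning_rate):
    """Update both weight matrices in place from one training example."""
    inputs = np.array(inputs_list, dtype=float).reshape(-1, 1)
    targets = np.array(targets_list, dtype=float).reshape(-1, 1)

    # Top half: the same forward propagation as when querying.
    hidden_outputs = sigmoid(wih @ inputs)
    final_outputs = sigmoid(who @ hidden_outputs)

    # Output error is target minus what the network worked out.
    output_errors = targets - final_outputs
    # Errors for the hidden layer, passed back through the weights.
    hidden_errors = who.T @ output_errors

    # Gradient-based update for a sigmoid layer.
    who += learning_rate * (
        output_errors * final_outputs * (1.0 - final_outputs)
    ) @ hidden_outputs.T
    wih += learning_rate * (
        hidden_errors * hidden_outputs * (1.0 - hidden_outputs)
    ) @ inputs.T

    return float(np.sum(output_errors ** 2))


rng = np.random.default_rng(0)
wih = rng.normal(0.0, 0.5, (4, 3))   # hidden x input
who = rng.normal(0.0, 0.5, (2, 4))   # output x hidden
errors = [train_step(wih, who, [0.9, 0.1, 0.5], [0.99, 0.01], 0.3)
          for _ in range(200)]
print(errors[0], errors[-1])
```

Repeating the step on the same example should drive the squared error down, which is a quick sanity check that the update has the right sign.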

38:38

OK, for the few minutes left I'm just going to show you that, with just the very simple ideas we've looked at, you can do some powerful things. People have many more sophisticated methods and optimizations, and you can read quite a lot about neural networks, but these simple ideas are enough. We can train a network to learn to recognize human handwritten numbers. There is a famous challenge dataset called MNIST, and it's got 60 thousand training examples. It's all free open data; you can get it yourself, and I can point you to it. There's a test set as well, so you can compare your results with others. If you look at the data you'll see the numbers; they're actually the pixel values of an image, and when you display them you can see that's a 5, 28 by 28 pixels. So we feed that data into a network and train it. Actually, I missed something: we have to choose what the output looks like. What I've chosen is to say we have 10 nodes at the output, and if the answer should be, say, 9, the ninth one has the biggest value. So you can see, if it's 5, it's the fifth node that should have a high value and the others low;

40:06

that's what I'm using to train the network. That last example is interesting, because in this one the network thinks the answer is probably 9, but it might also be 4.
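The ten-node output encoding can be sketched as below. The helper names and the 0.01 and 0.99 low and high values are assumptions for illustration, as is indexing the nodes from 0 so that node `n` stands for digit `n`. The `uncertain` list is made-up data imitating the "probably 9, maybe 4" case.

```python
def make_target(label, output_nodes=10, low=0.01, high=0.99):
    """Target vector: every node low except the one for the correct digit."""
    targets = [low] * output_nodes
    targets[label] = high
    return targets


def predicted_label(outputs):
    """The network's answer is the index of the node with the biggest value."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


target_for_5 = make_target(5)

# Hypothetical network output: strongest at node 9, a weaker signal at node 4.
uncertain = [0.02, 0.01, 0.03, 0.01, 0.40, 0.02, 0.01, 0.05, 0.03, 0.85]
answer = predicted_label(uncertain)
print(target_for_5, answer)
```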

40:19

Then you can get some really good results just with those simple ideas: 96% accuracy is what this got. That's not bad: 20 lines of code, and it's reading human handwritten numbers with over 90% accuracy. That's not bad at all. You can begin to tweak things like the learning rate or the number of hidden nodes, and you can see that you can get improved performance, though it might not be by much. I've deliberately put in the middle one of those graphs there just to remind us that neural network training is a random process, starting off with random initial weights, and sometimes it can go wrong. And that, for the scientists among us

41:08

here, reminds us that we should do this many, many times and take the best of all, or at least make sure that we've not ended up with an unlucky kind of answer. Remember the gradient descent from before: you might end up in the wrong minimum, not the best minimum, so do this many times. If you rotate the original training images, you can do better still.
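The advice to repeat training from different random starting points, because gradient descent can settle in the wrong minimum, can be illustrated on a toy curve with two minima. The error function, learning rate and number of runs below are assumptions chosen for illustration; a real experiment would retrain the whole network each time and compare test accuracy.

```python
import random


def error(x):
    # Toy error curve with a shallow minimum near x = 1.1
    # and a deeper one near x = -1.3.
    return x ** 4 - 3 * x ** 2 + x


def gradient(x):
    return 4 * x ** 3 - 6 * x + 1


def descend(start, learning_rate=0.01, steps=2000):
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


random.seed(0)
# Ten runs, each from a different random starting point.
runs = [descend(random.uniform(-2.0, 2.0)) for _ in range(10)]
# Keep the run that ended with the lowest error.
best = min(runs, key=error)
print(sorted(round(r, 2) for r in runs), round(best, 2))
```

Runs that start on the right-hand side settle in the shallow minimum, while taking the best of several runs recovers the deeper one.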

41:30

I think this is actually good, because if you look at the academic papers, they get 99%, 99.5%, using really advanced techniques, and I think, with just [unclear], that's not bad. I'll just

41:48

skip this. [unclear] You can do all of this on a Raspberry Pi, everything I've done, and you can look at the blog. You can even do it with a Raspberry Pi Zero, which costs about 4 or 5 euros, and you don't need an NVIDIA graphics card to do any of this. So in the last few minutes I'm going to try and do a live

42:06

demo, and I might regret that.

42:09

[unclear] So let's think of a number to classify. [unclear] a number 7 [unclear]. Here's one, actually

42:24

from a newspaper; that's a number 2. [unclear] This is a network I trained last night, so that's

42:29

a good thing. OK, [unclear] 3.

42:37

If it doesn't work, [unclear]. OK, let's resize that to 28 by 28 and save that as a PNG. [unclear]

43:03

There we go, that's a 3. [unclear]

43:16

[unclear] last night, and it didn't work every time. I'll stop there, and I'd love to chat about this afterwards. [unclear] questions [unclear]

43:27

Thank you [unclear].

43:40

So, questions? Hi.

43:57

So we have seen that the output nodes are the numbers 0 to 9. What are the input nodes? Are those the individual pixels? Yes, those are the individual pixels.

44:07

So if we have an image of 28 by 28, that's 784 pixels, I think, so you have an input layer of 784. You can choose other ideas: you can say I want to rescale everything, or I might want to have different features as inputs; you can do things like that. This is a very simple, very naive example which takes the raw pixels, and it works. But people will do other things: if they know something about the data they're working with, they might say, I think another feature is more likely to be a factor in the answer, and I want to use that to train the network instead. So they might use color, they might use alpha values, they might use something else.
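Turning a 28 by 28 image into 784 raw-pixel inputs, with an optional rescaling step, might look like the sketch below. The stand-in image is synthetic, and the 0.01 to 1.0 target range is an assumption for illustration; the answer above only says that rescaling is one possible choice.

```python
import numpy as np

# Stand-in for a 28 x 28 greyscale image with pixel values 0..255
# (synthetic data, not an actual MNIST record).
image = np.arange(28 * 28).reshape(28, 28) % 256

# One input node per pixel: flatten the grid into 784 values.
inputs = image.reshape(-1)

# Optional rescaling from 0..255 into 0.01..1.0 (assumed range).
scaled = inputs / 255.0 * 0.99 + 0.01

print(inputs.shape, scaled.min(), scaled.max())
```

Any other feature extraction would replace the flatten step, as long as it still yields one number per input node.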

44:55

So, any other questions? OK, thanks a lot.

### Metadata

#### Formal Metadata

Title | A Gentle Introduction to Neural Networks (with Python) |

Series title | EuroPython 2016 |

Part | 165 |

Number of parts | 169 |

Author | Rashid, Tariq |

License |
CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You may use, modify, and in unchanged or modified form reproduce, distribute and make publicly available the work or its content for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and pass on the work or content, including in modified form, only under the terms of this license |

DOI | 10.5446/21229 |

Publisher | EuroPython |

Release year | 2016 |

Language | English |

#### Content Metadata

Subject area | Computer science |

Abstract | Tariq Rashid - A Gentle Introduction to Neural Networks (with Python) A gentle introduction to neural networks, and making your own with Python. This session is deliberately designed to be accessible to everyone, including anyone with no expertise in mathematics, computer science or Python. From this session you will have an intuitive understanding of what neural networks are and how they work. If you are more technically capable, you will see how you could make your own with Python and numpy. ----- Part 1 - Ideas: - the search for AI, hard problems for computers easy for humans - learning from examples (simple classifier) - biologically inspired neurons and networks - training a neural network - the back propagation breakthrough - matrix ways of working (good for computers) Part 2 - Python: - Python is easy, and everywhere - Python notebooks - the MNIST data set - a very simple neural network class - focus on concise and efficient matrix calculations with numpy - 97.5% accuracy recognising handwritten numbers - with just a few lines of code! |