
# Deep Learning with Python & TensorFlow


Speech transcript

00:01

So, this is the other [unintelligible] talk. Thank you very much for coming to this session right after lunch. I know many of you are going to start getting sleepy right around the 30-minute mark, so do your best to stay awake and I'll do my best to keep you awake; we're going to work together to get through the talk. If you don't mind, I'll just introduce myself: my name is Ian Lewis and I'm a developer advocate at Google. I work

00:44

on the Google Cloud Platform team, so that kind of encompasses all of Google Cloud Platform. So if

00:54

you are familiar with things like App Engine or Compute Engine, that sort of thing, that's what we call Cloud Platform. I'm on Twitter as @IanMLewis; I've been tweeting throughout the conference, so you can find me fairly easily on Twitter. Just a little more background about myself: I'm based in Tokyo, Japan, I've lived there for about 10 years, and I'm kind of

01:28

active in the Python community there as well. I'm one of the four people who founded the

01:37

PyCon JP conference, which is about a 600-person conference, just to give you an idea of the size. We're going to be having the conference in September, in the third week of September; I believe it's from the 20th to the 24th. If you look at the conference website you can find out how to register. I think there are as of now something like 20 slots left, so hurry up. I'm also pretty enthusiastic about other communities: the Go community, as well as other open-source projects like Kubernetes and Docker, that type of containerization. So that's the type of thing you can expect to hear from me if you follow me on Twitter.

First, just as background, I wanted to go over what deep learning is. I'm going to give a quick, fairly high-level overview. How many of you went to the talks earlier, or know about deep learning? Quite a few of you. I'm going to try my best to build on that, but there may be a little bit of overlap. So what are we talking about when we talk about deep learning? We're talking about a specific type of machine learning that uses neural networks. Neural networks are a way of doing machine learning where you build a network of interconnected nodes. So you

03:26

essentially take something like this cat picture, change the pixels into a numerical representation, and pass that through the input layer into the network. Each of the internal nodes takes the values from its inputs and performs some operation on them, and eventually the network gives you the output.

03:55

These are typically organized in layers. You can see this blue one is the input

04:00

layer, and the next one there is what's called a hidden layer. If you think of a neural network as a kind of black box, the hidden layers are the layers inside that actually do the operations. Each one of these little nodes performs some sort of operation on its inputs, and that's called an activation function. The nodes are linked together using weighted connections: each of these little lines connecting the layers is weighted to indicate the strength of the connection between the nodes.
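As a rough illustration of what a single node does, here is a minimal sketch in plain Python. The sigmoid activation, the bias term, and all the numbers are assumptions chosen for the example; the talk does not say which activation function the pictured network uses.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def node_output(inputs, weights, bias):
    """One node: weighted sum of the inputs, then the activation function."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)


# Hypothetical values: three inputs feeding one hidden node.
inputs = [0.5, 0.1, 0.9]
weights = [0.8, -0.4, 0.3]  # connection strengths; a negative weight inverts the input's effect
bias = 0.0

print(node_output(inputs, weights, bias))
```

A layer is simply many such nodes reading the same inputs, each with its own weights.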

04:44

What are neural networks good for? They're essentially good for classification and regression problems, a very wide class of problems that you can apply machine learning to. Classification is basically putting things into buckets. You can have

05:01

a bunch of predefined buckets like A, B, and C, and then you get something and want to say which bucket it goes in. You put it through the network and get a probability that it goes in A, B, or C. Regression is somewhat more complicated: instead of a probability between 0 and 1 that something goes into a single bucket, you get a scalar output. Say you feed some values into a neural network and want it to output a temperature, from 0 kelvin up to some value; that's a scalar, and it could be solved by regression. I'm talking mostly about classification problems, but regression is also something that neural networks are good at. So what does that actually look like? Here's a

06:02

little demo that's available at playground.tensorflow.org. It lets you look into a neural network and get an idea of what's going on. Here we have some input features, which are the values that you input into the network, then some hidden layers in the middle, and then you get some sort of output. If you went to some of the earlier presentations, you saw something similar to this. Say that you have the weight and height of a person, and you have two different categories: these orange ones are, say, children and these blue ones are adults. If you want to classify new pieces of data coming in, you could train a network to do this. But this is very easy to do; it's essentially a linear classification problem, where you can just draw a line between the two and get a way of predicting between them.

Now let's look at something a bit more complicated, where one category is completely encircled by the other. If we try to train on this using just the x and y inputs, it basically never converges; it never figures out how to solve the problem. So we can add a hidden layer, which will essentially do this kind of linear classification multiple times. Let's do it one time: you'll see that it's basically creating one line, so everything on this side is classified as orange and everything on that side is classified as blue. But when we add new nodes (not layers, nodes), it gets a little more complicated. With two nodes it figures out two lines and then averages the results together: one node does one linear classification, the other node does another, and when you combine them it makes this band here. With three nodes we combine the results three times and get a kind of triangular structure. As we add nodes and layers, we can do things that are more and more complicated.

OK, so that's great, but how do we classify something that looks more like this? This is a spiral-looking dataset, with a blue spiral and an orange one. It's quite a bit more difficult, and we can't necessarily classify it using just x and y. You could imagine that sine would be good, or x squared, but even so we don't get very good output from a very shallow network with just three nodes. To solve these more complicated problems we need a much more complex network. For something like this it may or may not actually converge, but it's getting there. Right now it's not terribly stable, but it will stabilize. So once you have these more complicated networks, you can start solving more complex problems, and I'll talk about why that's important a little later.

You can see that each of these individual nodes has its own contribution to the final output, and each of these little lines shows the weights: the blue ones are positive and the orange ones are negative. The negative ones indicate an inverse relationship, and with that inverse relationship you can still get the right type of output. So essentially you can have positive and negative weighted connections between the different nodes. I want to turn this off so it doesn't burn my CPU.
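The triangular region described in the demo can be reproduced by hand. The sketch below is not the Playground's code: it uses a hard step activation and hand-picked weights (both assumptions made for readability) so that each of three hidden nodes draws one line and the output node turns blue only where all three agree.

```python
def step(z):
    """Hard threshold activation: the node is either on (1) or off (0)."""
    return 1 if z > 0 else 0


# Each hidden node draws one line: (weight_x, weight_y, bias).
# These numbers are hand-picked for illustration, not learned.
HIDDEN_NODES = [
    (0.0, 1.0, 1.0),    # on when y > -1
    (-2.0, -1.0, 2.0),  # on when y < 2 - 2x
    (2.0, -1.0, 2.0),   # on when y < 2 + 2x
]


def classify(x, y):
    """Return 'blue' inside the triangle formed by the three lines, else 'orange'."""
    hidden = [step(wx * x + wy * y + b) for wx, wy, b in HIDDEN_NODES]
    # Output node: fires only when all three hidden nodes are on (3 > 2.5).
    return "blue" if step(sum(hidden) - 2.5) else "orange"


for point in [(0, 0), (3, 0), (0, 3), (0, -2)]:
    print(point, classify(*point))
```

With only one or two entries in `HIDDEN_NODES` (and the threshold lowered to match), the same code yields a half-plane or a band, which mirrors the progression shown in the demo.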

10:50

Anyway, that was basically just a way of getting to

10:53

understand neural networks. It's not actually using TensorFlow under the hood; it's all done in JavaScript in the browser. But it's essentially a way of getting more familiar with neural networks. So what is a neural network? When you break it down, a neural network is essentially a pipeline: you take something like a matrix, which is called a tensor, and put it through a pipeline of operations. You can imagine that each of these is a matrix multiplication type of

11:31

problem. Think of a function where you take one matrix, multiply it by another matrix, multiply by another matrix, and another and another, and eventually you get out a tensor that represents the output for your particular problem. This is very loosely modeled after how the brain works,

11:56

only in that the connections between the individual nodes have a kind of strength, a weight, much as the connections between the neurons in your brain have a certain weight. But from a practical point of view, you're essentially doing matrix operations a bunch of times in order to make some sort of prediction. So what's a tensor? This is where TensorFlow gets its name from, but a tensor is not

12:30

something that people necessarily think of or encounter very often, unless you're a machine learning type of person. But most people are familiar with things like vectors and matrices, and a tensor is essentially a generalized

12:45

version of that. You can imagine something like

12:49

Euclidean space, our 3D space, where you have some sort of value out here in the space. A vector could be represented by a single array in a programming language, and a matrix would be a two-dimensional array. A tensor is essentially a generalized version of that, an n-dimensional array, so you could have

13:21

any number of dimensions. This could be one dimension for each type of feature that you're feeding into the network. You can do essentially the same sort of operations on a tensor as you would on a matrix, like matrix multiplication or matrix addition. So the way a simple neural network works is that you have these connected nodes. This is our

13:58

input tensor with x1, x2, x3, and then we have the weights, which are also represented as a tensor that you multiply by. Finally we add the biases, also in the form of a tensor, and then softmax the result to get the output. This is a very basic one-layer network. You can think of these individual weights as the entries in the matrix multiplication: this matrix times this matrix gives you this kind of interconnected structure. As you may have seen in the earlier talks, most of the operations are performed this way, where you have the input x times W, which is the weights, plus b, which is the biases, and you can do that multiple times, once for each layer. So these are

15:11

basically just multiplications and additions, and then we have

15:14

this softmax thing at the end. Softmax is essentially just a way of normalizing the data, and you'll typically see it at the very end of a network. What happens is that after you've gone through the network, the outputs at this level would be some sort of values, like 50 for this one, 20 for this one, and 0.32 for this one, so you don't really get an idea of what they actually mean; they're relative values for your particular network. When you put them through the softmax function, they're normalized to values between 0 and 1, so

15:54

you get essentially a prediction output. These individual values then represent the probability, a percentage say, that a particular input goes into a particular bucket. Say this is a cat, this is a dog, and this is a human,

16:11

and we put in an image. The output might be about 0.99 that it's a cat, 0.01 that it's a dog, and 0.001 that it's a human, which essentially means that it's a cat.
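The normalization step can be sketched in a few lines of standard-library Python. The raw scores and the three class names below are made up for illustration.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive values that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw network outputs for the classes cat, dog, human.
raw_scores = [5.0, 2.0, 0.3]
probabilities = softmax(raw_scores)

for label, p in zip(["cat", "dog", "human"], probabilities):
    print(f"{label}: {p:.3f}")
```

The largest raw score always gets the largest probability, so softmax changes the scale of the outputs without changing which class wins.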

16:32

So that's great; that's how we

16:33

actually do prediction: we feed in inputs, go through all these operations, and get some sort of prediction, right? But how do we

16:41

actually train a model? A model is trained using a method called backpropagation, which was part of some of the earlier talks. Here is the neural network as we've been talking about it: here's layer 1, here's the second layer, here's the softmax, and here's our output. We go through here and do the prediction, but then we use labeled test data as we put it through our network. The test data says: here's the actual data, here's a cat picture, and this is a cat. So you have the

17:23

expected output and the actual test data associated with each other, so you know which ones are cat pictures and which ones are dog pictures. And so what we do is

17:35

put, say, that cat picture through here, and it comes out with a result. We take that result and the expected value and find essentially the difference between those two values. So say it came out that it was 86

17:57

% certain it was a cat, but we know that it's a hundred percent certain it's a cat. We want to be able to nudge our network in the direction of determining with 100% accuracy that it's a cat. To do this we use what's called a loss function to

18:16

find the difference. A typical loss function might be cross entropy, but there are a number of other loss functions that you can use depending on the situation. Then you go through and optimize the result by using something like gradient descent, which was also talked about a little earlier, along with optimizers more generally. Essentially you put the loss through this optimization function and then backpropagate the values into the weights and biases for each individual layer. So weight 1, bias 1, weight 2, and bias 2 are the weights and biases used in the network here. What you're doing is backpropagating these values and updating the weights and biases, nudging the network in the direction of giving you the proper output.
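To make the "nudge" concrete, here is a minimal sketch of one gradient descent step with a cross-entropy loss. As a simplification it updates the raw scores directly rather than backpropagating into weights and biases, and the scores, class order, and learning rate are assumptions picked for the example.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive values that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(predicted, expected):
    """Loss is small when the predicted probability of the true class is near 1."""
    return -sum(e * math.log(p) for p, e in zip(predicted, expected))


# Hypothetical example: classes are (cat, dog, human) and the picture is a cat.
expected = [1.0, 0.0, 0.0]
scores = [2.0, 0.5, 0.1]  # raw network outputs before softmax
learning_rate = 0.5

loss_before = cross_entropy(softmax(scores), expected)

# For softmax followed by cross entropy, the gradient of the loss with
# respect to each score is simply (predicted - expected).
gradient = [p - e for p, e in zip(softmax(scores), expected)]
scores = [s - learning_rate * g for s, g in zip(scores, gradient)]

loss_after = cross_entropy(softmax(scores), expected)
print(loss_before, loss_after)
```

In a real network the same gradient is pushed one step further back, through the matrix multiplication, to obtain the updates for the weights and biases of each layer.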

19:21

Then you do this many, many times: you train it over and over and over again, and it eventually nudges in the direction of a very accurate network. At least that's the theory; it doesn't always work like that, but in general that's the idea. So that's a relatively quick overview of what neural networks are like.

So why are we actually talking about this? One of the earlier talks mentioned things like ImageNet, which is a famous open dataset for machine learning, where we got something like a 25 percent error rate around 2011. Essentially the reason we're talking about this all of a sudden is that people have started to get very much better at training these neural networks, because of a number of breakthroughs in training these networks to do things that are practically useful. You can think of the quality of

20:46

the neural network kind of like this. With traditional learning algorithms, as you got more data the performance would increase but level off very quickly, and then you would have small neural networks, which also level off quickly. So essentially what people did was train on an amount of data up to about here, and adding more data wouldn't

21:20

actually make it much better, so they would essentially stop right here. But we have since found these kinds

21:27

of neural network methods that allow us to scale the learning much better, so as we throw more data at the problem, the networks get quite a bit more sophisticated and perform quite a bit better. So we've been able to create these large deep neural

21:45

networks that continually get better as we give them more and more data. With that come other problems, which I'll talk about in a second, but essentially these medium and large neural networks have become possible recently. And so

22:10

here is a model; this is the GoogLeNet network, essentially an Inception model that was trained on ImageNet. What this is essentially doing is labeling images. You can think of each one of these as, say, a matrix multiplication or some sort of operation on a matrix; the data goes through several different layers and eventually gives you an output tensor that tells you the labels. This is what we mean when we talk about deep neural networks: networks that have many, many, many

22:56

layers before they actually give you the output. By adding these layers we can start solving more and more complicated problems and actually get pretty good results. But this is a problem: you can imagine that each one of these is a matrix multiplication, and these tensors might be, say, a large image, a megabyte or something, that you're turning into a tensor and then doing a matrix multiplication on. You can imagine how many operations you have to do in order to do a prediction even just once, and you have to do this many, many times over when actually training networks. So what people do is use GPUs, and GPUs are

23:45

very good at this, but still you're essentially waiting two weeks, or sometimes even months, for the results of one single training run. So what people started to do is use things like supercomputers in order to train models faster. But this is still a problem, because not everybody has access to a supercomputer. How many of you have access to a supercomputer? Somebody does; that's the most I've ever seen, like three or four people. So

24:19

anyway, supercomputers are something you have to lease time on, a little like the old mainframes, where you'd get a time slot in the middle of the night or something like that, and they take tons of money, so they're not exactly the easiest or best way. The ideal is for everybody to be able to do machine learning, so what you need is distributed training. We've actually been able to do that, and we use it for a lot of practical applications,

24:55

things like Google Photos and detecting text in Street View images. So there's a lot of exciting things going on, and recently we've seen quite a lot of activity. At Google there are a number of projects internally that use machine learning; this is the number of directories that contain a model description file, and from 2004 we've had this kind of exponential growth. And

25:31

by distributing it we've been able to do this much, much faster. So now I'd

25:38

like to talk about TensorFlow itself. TensorFlow is an

25:42

open-source library. It's a very general-purpose machine learning library, particularly for doing neural networks, but we're also expanding it to encompass other

25:55

types of machine learning. It was open-sourced in November, and it's being used internally at Google for a lot of our internal projects. It supports a number of things, like this kind of

26:14

flexible and intuitive construction, so you're basically able to do a lot of things in an automated way. It supports training on things like CPUs and GPUs, and one of the nice things is that you define these networks in Python. So before I get

26:35

to looking at what it looks like, some of the core concepts. You have a graph: the name TensorFlow comes from the idea of taking tensors and having them flow through a directed data flow graph.

26:52

So a graph is the representation of that. The operations are the actual nodes of the graph, the operations that you do.

26:59

Tensors are the data that's actually passed through the network. Then we have other types of structures: we have the idea of constants, which are things that don't change,

27:15

but then you have things like placeholders, which are basically inputs into our network, and variables, which are things that actually change during training; these are what you usually use for your weights and biases, etc. A session is something that encapsulates the overall connection between the TensorFlow core and the models that you define. I should mention that TensorFlow

27:52

is a library that is based on the same sort of concepts as many other libraries.

27:58

As with those libraries, it has a Python interface, the API, and then a C++ core that enables you to do these very fast operations, so when you're actually doing training you're not going through Python. So this is a

28:21

non-exhaustive list of the operations you can do with TensorFlow: things like math (addition, subtraction, multiplication, and division of these tensors), matrix operations, stateful operations, and so on.
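The core concepts described above (graph, operations, constants, placeholders, variables, session) can be sketched as a toy data flow graph. Every class below is hypothetical and written only for this illustration; none of it is TensorFlow's API, and real tensors are replaced by plain floats.

```python
class Constant:
    """A value fixed when the graph is built."""
    def __init__(self, value):
        self.value = value

    def evaluate(self, feed):
        return self.value


class Placeholder:
    """An input whose value is supplied only when the graph is run."""
    def __init__(self, name):
        self.name = name

    def evaluate(self, feed):
        return feed[self.name]


class Variable:
    """A value that persists between runs and can be updated by training."""
    def __init__(self, initial):
        self.value = initial

    def evaluate(self, feed):
        return self.value


class Add:
    """An operation node: evaluates both inputs, then adds them."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) + self.right.evaluate(feed)


class Multiply:
    """An operation node: evaluates both inputs, then multiplies them."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) * self.right.evaluate(feed)


class Session:
    """Runs a node of the graph with concrete placeholder values."""
    def run(self, node, feed=None):
        return node.evaluate(feed or {})


# Building the graph computes nothing yet; it only records the structure.
x = Placeholder("x")
w = Variable(3.0)
b = Constant(1.0)
y = Add(Multiply(x, w), b)

session = Session()
first = session.run(y, {"x": 2.0})   # 2 * 3 + 1
w.value = 4.0                        # a training step would update the variable
second = session.run(y, {"x": 2.0})  # 2 * 4 + 1
print(first, second)
```

Separating graph construction from execution is what lets a runtime hand the whole computation to a fast backend instead of interpreting it step by step in Python.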

28:41

Let's actually see what this looks

28:44

like when you run it. So this

28:54

is a Jupyter notebook. How many people have heard of Jupyter

28:58

or use Jupyter? OK. How many people have been asked that more than five times at this conference? All right, so I'll just assume that you know it and go from there. Let me just

29:16

restart this kernel here. This is a Python 2 one; TensorFlow also supports Python 3 if I remember right, but this particular example is Python 2. TensorFlow is pretty easy to get started with. This is just using the MNIST example; an earlier speaker was talking about this example earlier today. It's essentially a bunch of images of handwritten numbers, and you're supposed to determine which number is actually present in the image. The training images look something like this: we have 55,000 images, and they're all in this one long array, and each one has 784 pixels. They're basically monochrome, so just

30:21

black and white.

30:26

If you look at the shape of that, it's a 55,000-element array with 784 pixels each. If you look at the images themselves, each one is a flat, linear array, so the whole set is a two-dimensional array, and each of these values is a value from 0 to 1 that is essentially how dark that particular pixel is. So some of these are like 0.23, which is a light shade, all the way up to 1. So that's

31:03

essentially what the data looks like.
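The layout described here can be illustrated without loading MNIST. In the sketch below a tiny 3x3 "image" stands in for a 28x28 digit, and the 0 to 255 raw pixel range is an assumption about how the grayscale values were stored before scaling.

```python
# A tiny 3x3 "image" stands in for a 28x28 MNIST digit (28 * 28 = 784 pixels).
# Raw pixels are assumed to be integers from 0 (background) to 255 (full ink).
image = [
    [0, 128, 0],
    [0, 255, 0],
    [0, 255, 0],
]


def flatten_and_scale(rows):
    """Flatten a 2-D image into one long list with values between 0 and 1."""
    return [pixel / 255 for row in rows for pixel in row]


pixels = flatten_and_scale(image)
print(len(pixels))  # 9 here; it would be 784 for a 28x28 image

# A whole dataset is then a list of such rows, giving a two-dimensional
# array of shape (number of images, pixels per image).
dataset = [pixels, pixels]
print((len(dataset), len(dataset[0])))
```

Flattening throws away the information about which pixels are neighbors, which is one reason the single-layer model is sensitive to where a digit sits in the frame.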

31:06

That's how we've represented it here; if you had a color image we'd need to represent it a little differently, but that's essentially how we're doing it in this case. This is just showing an example value using matplotlib, so this is one of the input images; that's essentially what the training data looks like. Then we have these training labels associated with each image. Each label is a vector of size 10 with a bunch of zeros in it and a 1 in the right location, which indicates the number for that particular image. For this image we have an 8 here. If we look at the training labels, the shape of that is 10 in size, and if we look at the particular one for this case, we can see where the 1 is; the columns run from 0 to 9. These are what are called one-hot vectors, where you have 0 in all the values except for one. This is used often in training data. The data that we get out of the network is actually

32:36

similar to this, except that it will be a bunch of values from 0 to 1, essentially probabilities. And here are some of

32:46

the weights themselves shown as images; you may have seen these earlier. Once you've trained the network you get something like these: training sets the weights and biases so that individual pixels indicate whether the image is a particular number. In this case we're actually using a very simple neural

33:12

network,

33:14

with just one layer, which works this way: if you see pixels in these blue areas, it's probably a 0, and if there are any pixels in the red area here, it's probably not a 0, and it basically aggregates the probabilities of whether this is a 0. You can see that in many of the other ones too: for the 1 you'd see pixels in this area, for the 2 in this area, and for the 3 in this area, so the weights look similar to the numbers you're looking for.

Next, this is where we actually define our network. Here we're importing TensorFlow and creating the placeholder that I talked about earlier. This is our input into the neural network, and it has a size of 784, which is the number of pixels. Then we have these weights and biases as variables, which can be updated as we train the model. And here is where we actually define our network; this is the same single-layer network. You can define it in Python very similarly to the way you would in

34:35

mathematics. Here we can say we're doing a matrix

34:39

multiplication of the inputs times the weights, adding the bias variable, and then doing a softmax on it at the end. TensorFlow internally will take these and build

34:54

our data flow graph, the model representation. Once we have that, we can use it to train the model.
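The single-layer model described above, softmax(x · W + b) with 784 inputs and 10 outputs, can be written out in plain Python to show the shapes involved. This is a sketch, not the notebook's TensorFlow code; starting the weights and biases at zero and using random numbers as a stand-in image are assumptions made for the example.

```python
import math
import random

NUM_PIXELS = 784   # one input per pixel of a 28x28 image
NUM_CLASSES = 10   # digits 0 to 9


def softmax(scores):
    """Turn arbitrary scores into positive values that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def model(x, weights, biases):
    """Single layer: softmax(x * W + b), with W of shape (784, 10) and b of length 10."""
    scores = [
        sum(x[i] * weights[i][j] for i in range(NUM_PIXELS)) + biases[j]
        for j in range(NUM_CLASSES)
    ]
    return softmax(scores)


# Untrained parameters start at zero here (an assumption for the sketch).
weights = [[0.0] * NUM_CLASSES for _ in range(NUM_PIXELS)]
biases = [0.0] * NUM_CLASSES

random.seed(0)
x = [random.random() for _ in range(NUM_PIXELS)]  # stand-in for one flattened image

y = model(x, weights, biases)
print(len(y), sum(y))
```

Before any training every digit is equally likely, so each of the ten outputs is one tenth; training changes `weights` and `biases` so that the output concentrates on the correct digit.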

35:08

So this is our neural network, and then we have a placeholder for the expected outputs, this y prime. We've defined a cross entropy function here as our loss function, and then we basically plug it into this gradient descent optimizer, optimizing using cross entropy, and this creates our training step. So this encompasses the entire neural network plus the training that we need to do. Like some of the explanations in the other talks, gradient descent is essentially a way of nudging our neural network in the direction that we want. I think one of the talks described it as going down a mountain at night using a single little flashlight, a

36:02

torch, and just going a little bit at a time. Essentially that's the idea: you're going downhill and moving in the

36:10

direction toward a minimum, to actually minimize the loss. So the altitude would be the error, the loss generated by the loss function, and we basically nudge the network in the direction where the loss is minimized, and then do that over and over and over again. Each one of these is a training epoch as we go down with the gradient descent optimizer. Here we're going to train a thousand times, each time on a particular piece of the data. What's great is that we can also do this kind of mini-batch training, where you take just a small subset of the total

36:59

training data. So we're not training every time on every single piece of the training data; we're training on a randomly selected batch of, in this case, 100 elements, I think, and we're doing that a thousand times. This is taking longer than usual. And what's good

37:22

about that is that you don't have to train on the entire training data; you take a randomized subset of the training data, and that's essentially the same thing you do

37:36

when you do, say, a statistical survey where you only ask some of the people: you get a statistically representative sample of the data. OK, so this is actually done.
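The training loop described above (a one-layer softmax network, a cross-entropy loss, and a thousand gradient descent steps on random mini-batches of 100) can be sketched in plain Python. This is not the TensorFlow code from the demo: the toy two-class data set, the learning rate of 0.5, and every function name here are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, biases, x):
    """One-layer network: softmax(W x + b)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


def cross_entropy(probs, label):
    """Loss for one example whose true class index is `label`."""
    return -math.log(max(probs[label], 1e-12))


def train_step(weights, biases, batch, learning_rate):
    """One gradient descent step on a mini-batch; returns the mean loss."""
    n_classes, n_inputs = len(weights), len(weights[0])
    grad_w = [[0.0] * n_inputs for _ in range(n_classes)]
    grad_b = [0.0] * n_classes
    loss = 0.0
    for x, label in batch:
        probs = predict(weights, biases, x)
        loss += cross_entropy(probs, label)
        for k in range(n_classes):
            # Gradient of the cross-entropy w.r.t. logit k is (p_k - y_k).
            delta = probs[k] - (1.0 if k == label else 0.0)
            grad_b[k] += delta
            for j in range(n_inputs):
                grad_w[k][j] += delta * x[j]
    scale = learning_rate / len(batch)
    for k in range(n_classes):
        biases[k] -= scale * grad_b[k]
        for j in range(n_inputs):
            weights[k][j] -= scale * grad_w[k][j]
    return loss / len(batch)


def make_toy_data(rng, n):
    """Hypothetical stand-in for the real training set: 2 classes in 2-D."""
    data = []
    for _ in range(n):
        label = rng.randrange(2)
        centre = (-1.0, -1.0) if label == 0 else (1.0, 1.0)
        data.append(([c + rng.gauss(0.0, 0.5) for c in centre], label))
    return data


def accuracy(weights, biases, data):
    """Fraction of examples whose most probable class is the true one."""
    correct = 0
    for x, label in data:
        probs = predict(weights, biases, x)
        if probs.index(max(probs)) == label:
            correct += 1
    return correct / len(data)


rng = random.Random(0)
train_data = make_toy_data(rng, 1000)
weights = [[0.0, 0.0], [0.0, 0.0]]
biases = [0.0, 0.0]

losses = []
for step in range(1000):
    batch = rng.sample(train_data, 100)   # random mini-batch of 100
    losses.append(train_step(weights, biases, batch, learning_rate=0.5))

final_accuracy = accuracy(weights, biases, train_data)
```

Each call to `train_step` sees only 100 of the 1000 examples, which is the "representative sample" idea from the survey analogy: the gradient from the batch is a noisy but cheap estimate of the gradient over the whole data set.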

37:53

OK, so now it has gone through the training, and at the end we can check the accuracy of our neural network. In this case it got about 90 percent, which is pretty bad, but this is a very simple one-layer neural network. So that's essentially how, if you use TensorFlow, you can create these steps and run through them. But for all these steps the actual computation is done under the hood, in the C++ core, and that also maps onto the available devices, so if you have GPUs or CPUs available it will actually

38:43

map the operations to those particular devices. In this case I'm running this on, I think, a

38:49

32-core machine, so it actually maps onto that. I'll talk a little bit about TensorBoard later. Let me go back. Can I go back? No, that's not it, that's not the back, the

39:17

only thing. OK, so here I'm going to

39:23

look at a slightly more complicated example, where we get a bit better accuracy. We're doing the training using exactly the same data that we did before, but we're going to build what's called a convolutional network, which I talked a bit about earlier on. This basically allows you to look at the image part by part and pick out specific features from

39:59

each part of the image. This helps with things like, say, the network that I had earlier: you have the image, and if you saw pixels in a certain location then that would indicate what number it was. But what happens if you write this zero or whatever, but you actually translate it over a little bit? That particular network wouldn't be very good at figuring out that I just moved this 0 over a few pixels. So for stuff like that, this kind of convolutional way of

40:36

looking at it helps a lot. In this case what we're going to do is initialize the weights and biases a little bit differently; I think this one is taking kind of random weights to begin with. But here's the convolution part. Essentially what we're doing is going over the image, moving what are called kernels, I guess, over the image, and building this other kind of tensor that has a particular value for each position of the kernel over the image. This is picking out features of each individual part of the image rather than looking at the image as a whole. Then we can take the same thing and use what's called pooling. Pooling is another method, and one of the most common examples is max-pooling, where you take the individual values from a part of the tensor and pick the maximum value. This gives you somewhat of a representation of a particular part of the image as well. We put that all together into a layer, and you can create several layers like that. So here we set up our first convolutional layer by building the weights and biases and then building our layer here, and the second convolutional layer takes as inputs the outputs from the previous layer and does basically the same sort of thing. Then at the end we create a densely connected layer. As some of the previous talks described, the convolutional layer is not fully connected between the values, because it is using this kind of kernel translated all over the image, but the final output layer is a densely connected layer, which allows you to do basically the exact same thing that we did in our previous example; we just don't have the

43:12

convolutional part, and that

43:17

allows us to get much better accuracy. I'm not really going to talk about dropout, but then we have the readout layer, and this is essentially just doing the softmax on the output from the previous layer. Then you can train the model. In this particular one we're using the same kind of cross-entropy but using the Adam optimizer instead of the regular gradient descent optimizer, and with those kinds of optimizations you can get much better performance. Here we're doing a lot more training on this particular one because it's a deeper network and we can train

44:12

it for more iterations and get a lot better results. Our previous one, if we continued to train it more, probably wouldn't get very much better than 90 percent, but in this case we can train it quite a lot more times in order to improve the accuracy. So this actually trains about

44:27

20 thousand times on mini-batches of 50. I won't go through actually doing this because it takes about 5 minutes or something to run through it all, but you can see from the output that we get about 99.2% accuracy, which is a good deal better than 90%, right?
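The two building blocks of the convolutional layers described earlier, sliding a kernel over the image and then max-pooling the result, can be sketched in plain Python. This is only an illustration: the 6x6 image, the hand-written vertical-line kernel, and the function names are assumptions, and a real network learns its kernels during training. As in most deep learning libraries, the "convolution" here does not flip the kernel.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]


# A 6x6 "image" with a vertical stroke in column 1,
# and the same stroke moved one pixel to the right.
stroke = [[1 if c == 1 else 0 for c in range(6)] for _ in range(6)]
shifted = [[1 if c == 2 else 0 for c in range(6)] for _ in range(6)]

# Hypothetical kernel that responds to a vertical line in its centre column.
kernel = [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]]

features = conv2d(stroke, kernel)            # 4x4 feature map
features_shifted = conv2d(shifted, kernel)   # same response, one column over
pooled = max_pool(features)                  # 2x2 after pooling
pooled_shifted = max_pool(features_shifted)
```

In this hand-picked case the two feature maps differ by one column, but the pooled outputs are identical, which hints at why a small shift of a digit is less of a problem for this kind of network. A shift that crosses a pooling block boundary would still change the pooled output.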

44:48

So instead of 1 in 10, it's around 1 in a hundred that is classified incorrectly. So you can do things from very simple networks to much more complex networks. So let me go back

45:08

to my... So one of the other things that you

45:17

can do, because TensorFlow has this kind of internal representation, this knowledge about all the graphs

45:22

and how everything is working together,

45:24

is write log output files as you do training, and these can then be read by an application called TensorBoard. We were obviously very original at naming things, you know, TensorFlow, TensorBoard. But what's really cool about this is that you can look at things like the accuracy and the values of the loss functions and

46:04

look at these kinds of graphs as you are training and going over the data, to see how your

46:10

neural network is performing. In this case we're seeing the actual accuracy as we're training. This is, I think, the simple version, so we get up to about 90% pretty quickly, but we don't really get very much better as the training goes on. You can look at things like the accuracy, but you can also look at the actual loss function. This is looking at the cross-entropy value, and that goes down and down and down; this should be close to the inverse of the accuracy. You can also look at many of the other values, and these basically correspond to the variables or the

46:54

individual parts of your model, basically the values that you have. So here cross-entropy was an actual Python object that was defined, and

47:09

then you can get this kind of log output data. Other things like the max and min and stuff like that are also part of it, and these are the kind of

47:24

input images that you can look at. But one of the other cool things is that you can actually look at the graph itself, the model itself that you're building. Here we have a 2-layer network, so we can zoom in and look at the individual pieces of the network, like the weights and biases for individual parts of the network, and look at things like the dropout values and the loss function as well. So from the Python code that we wrote, this gives you a full graph representation of the network. In the case of something very complicated, like the image recognition thing that I was showing you earlier, you would see this huge, huge graph that was generated by

48:21

that. This is really cool because it helps you visualize your neural network as you define it. Let's go back here. I have about 10 minutes or so left, something like that; we'll get there. So, distributed training. The main difference between

48:54

TensorFlow and many of the other kinds of libraries that are out there is that TensorFlow was built from day 1 with distributed training in mind. Essentially TensorFlow is built in such a way because we wanted to productionize it, to actually do practical work

49:13

with the library, with our networks. So we want to be able to train these faster, and based on the kind of hardware breakthroughs and improvements that we've made in the past, we want to be able to use those to train models faster. TensorFlow supports multiple different types of parallelism. There's model

49:40

parallelism, which is essentially breaking up the model, so each one of these machines takes a different part of the model and you're feeding the data through here, and you break up the work that way. But it

49:57

also supports what's called data parallelism, which is

50:00

hopefully on one of the slides. It

50:03

disappeared. So data parallelism is the opposite, where you break up the data instead, and each

50:09

one of the machines has a full copy of the model. So you're splitting up, like, records 1 through 100 and sending them to one machine, and 101 through 200 to a different machine, and breaking up the work that way. There are a number of trade-offs between these, whether you

50:32

use full-graph or subgraph model parallelism, or synchronous or asynchronous data parallelism, and, like,

50:43

yeah, there are pluses and minuses to each of these, and

50:50

so there's no, like,

50:52

single answer to this, but

50:55

I do know that at Google we use data parallelism pretty much exclusively. So TensorFlow

51:05

basically supports, in a number of ways, these different types of model parallelism

51:11

and so on, and

51:16

51:16

data parallelism is this one. OK, so this is where you take the data and split it up, and each one of these replicas has the entire model. Once you've done some training, you pass the result to the parameter server, which is the thing that holds all the weights and biases. These are updated and then pushed back to the model replicas. Then there are asynchronous and synchronous versions of this, where you're either updating the weights and biases in parallel, or you're updating them synchronously for

51:49

each iteration. The asynchronous one is much

51:55

faster, but it can add some noise to your model because the parameters can be changed midway through a run, whereas in the

52:12

synchronous version you run on the split-up data and then wait until all the model replicas have finished a particular step before going on to the next one. That reduces the noise but will make the training a

52:25

bit slower. So this is an example of how that would run with TensorFlow: you have a bunch of workers doing the training in parallel, and then you have some parameter servers, and in between the servers they use gRPC to communicate.
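The synchronous form of data parallelism described above can be sketched in a single process. This is a toy simulation, not TensorFlow's distributed runtime and not gRPC: the `ParameterServer` class, the linear model `y = w*x + b`, the learning rate, and the data are all assumptions made for illustration.

```python
class ParameterServer:
    """Holds the shared weight and bias and applies gradient updates."""

    def __init__(self, weight=0.0, bias=0.0, learning_rate=0.1):
        self.weight = weight
        self.bias = bias
        self.learning_rate = learning_rate

    def pull(self):
        """Replicas fetch the current parameters."""
        return self.weight, self.bias

    def push(self, grad_w, grad_b):
        """Apply one gradient descent update to the shared parameters."""
        self.weight -= self.learning_rate * grad_w
        self.bias -= self.learning_rate * grad_b


def replica_gradient(params, shard):
    """Gradient of the mean squared error for y = w*x + b on one data shard."""
    w, b = params
    grad_w = grad_b = 0.0
    for x, y in shard:
        error = (w * x + b) - y
        grad_w += 2 * error * x
        grad_b += 2 * error
    return grad_w / len(shard), grad_b / len(shard)


def split(records, n_replicas):
    """Give each replica a contiguous, equally sized slice of the records."""
    size = len(records) // n_replicas
    return [records[i * size:(i + 1) * size] for i in range(n_replicas)]


# Toy data generated from y = 3x + 1; purely illustrative.
records = [(k / 100.0, 3.0 * (k / 100.0) + 1.0) for k in range(200)]
shards = split(records, n_replicas=2)   # first 100 records, next 100 records

server = ParameterServer()
for step in range(2000):
    params = server.pull()              # every replica sees the same parameters
    grads = [replica_gradient(params, shard) for shard in shards]
    # Synchronous update: wait for all replicas, then average their gradients.
    mean_w = sum(g[0] for g in grads) / len(grads)
    mean_b = sum(g[1] for g in grads) / len(grads)
    server.push(mean_w, mean_b)
```

In the asynchronous variant each replica would call `push` as soon as its own gradient is ready, without waiting for the others, so some gradients would be computed from parameters that have already changed. That is the source of the extra noise mentioned in the talk.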

52:44

So why is this kind of data

52:45

parallelism important? OK, so let's say that instead of getting a cat out of

52:51

our network we got a dog, and we're like, well, we want to improve our network, we want to make it better. So what do

52:59

we tweak in order to make that actually better? I don't know, maybe this; this is probably a good idea, I don't know. So we

53:08

do that, we make the tweak and run this again, and we're like, OK, yeah, this is nice, and it's

53:14

running on my GPU, and it comes out and it doesn't make it better, and now I have to go back and start over again. So normally you want to be able to run these experiments over and over again very quickly; you don't want to have to wait a week in order to figure out whether your tweak worked well enough. This is a problem even for people who are experts in machine learning: you have experience and you have literature that you can use to narrow down what things you might want to tweak, but in the end you need to be able to go back and test to see if the actual tweak that you made improves things or doesn't, and this takes time. That's why it's very important to be able to do this kind of distributed training. But one of

54:12

the problems is that as you scale the number of nodes, the number of connections between your parameter servers and your workers increases kind of exponentially, so this doesn't really scale; you essentially bottleneck on the network,

54:26

because these guys are talking over TCP and you get latency on the order of milliseconds between the machines. So you need to have a dedicated network or dedicated hardware network; a lot of people use things like InfiniBand or whatever in order to make this go faster. But this is something that's really a problem at the moment. So one of the things that we did at Google, internally, is that we do distributed training, but we have a dedicated network that doesn't use TCP/IP; it skips the whole TCP/IP stack and is able to have the communication between the machines run on the order of nanoseconds or microseconds instead of milliseconds. This is something that we are planning on making public as

55:36

what's called Cloud ML, which will allow you to run TensorFlow graphs inside of Google's data

55:44

centers. We're also planning on eventually exposing, as part of that, the dedicated hardware that we use. These are called Tensor Processing Units, I think that's what they're calling them, and essentially they're dedicated hardware used for doing tensor operations. So we'll be able to expose those to other people so that they can use that kind of dedicated hardware in order to do more experiments and things like that.

56:19

So I think that's all I had. I want to thank you for coming and spending the last hour here. How many of you are still awake? Raise your hands. So

56:33

about 70% of you. So thanks a lot for coming out. Definitely check out the tensorflow.org website; there are tons of really good examples,

56:45

like if you go here, and then if you

56:48

look here, there are tutorials and documentation. This is really, really good and has a lot of good examples about how to use it, especially if you're a beginner; there are different ones for different levels of people, as well as how to use TensorFlow Serving to

57:08

move towards actually productionizing your models. So thanks a lot for coming. Do we have a question? We have like 2 minutes, so that works. Yes, sir. The question has to be 15 seconds, and then, you know, maybe 30 seconds for the answer.

Is there anything like profiling for this kind of model, like to have an overview of how many multiplications, how many parameters each block of the flow graph needs?

So, like, the actual time it took to run it? I don't know if TensorBoard gives you that. I think that it probably should. If it doesn't, I don't know what the state of that actually is, but I think that could be something that you could visualize in TensorBoard as part of the output: you basically log that as a value that you can view here in TensorBoard, and then see how each part of the graph performed, things like that. Other questions? One right behind you.

The previous talks today mentioned that you typically have to do some feature extraction before you can actually apply neural networks. Will TensorFlow help me speed up my manually designed feature extraction, or is it designed only to do neural network stuff?

So at the moment it's mostly geared towards neural networks. I mean, obviously you could do feature extraction using a separate neural network, so you could have one neural network that does the extraction and another one that does the actual classification. There is some work going on; I forget what it's called, it's like TensorFlow Wide, something like that, I think. Essentially, instead of having deep neural networks, the idea is that you have these more standard types of machine learning algorithms, and so I think that there is work going on there to incorporate more standard machine learning algorithms, so you can do that sort of feature extraction beforehand and stuff like that. But it's kind of ongoing work. You might try and search for TensorFlow Wide. I haven't played with it personally so I can't really give you details. Yes, thanks a lot.

00:00

Rechenschieber

Softwareentwickler

Rechter Winkel

Invarianz

Applet

Schlussregel

Garbentheorie

Softwareentwickler

Cloud Computing

Computeranimation

00:51

Umwandlungsenthalpie

Bit

Datennetz

Open Source

Gebäude <Mathematik>

Technische Informatik

Quellcode

Term

Quick-Sort

Computeranimation

Übergang

Virtuelle Maschine

Knotenmenge

Twitter <Softwareplattform>

Mittelwert

Mereologie

Datentyp

Projektive Ebene

Algorithmische Lerntheorie

Drei

Neuronales Netz

03:25

Einfach zusammenhängender Raum

Lineares Funktional

Nichtlinearer Operator

Pixel

Datennetz

Blackbox

Selbstrepräsentation

Kardinalzahl

Ein-Ausgabe

Binder <Informatik>

Quick-Sort

Computeranimation

Knotenmenge

Funktion <Mathematik>

Ein-Ausgabe

Computerunterstützte Übersetzung

Gerade

Funktion <Mathematik>

04:42

Portscanner

Virtuelle Maschine

Bit

Lineare Regression

Klasse <Mathematik>

Klassische Physik

Punkt

Neuronales Netz

Skalarfeld

Computeranimation

Neuronales Netz

Funktion <Mathematik>

06:00

Resultante

Demo <Programm>

Bit

Wellenpaket

Gewicht <Mathematik>

Ortsoperator

Hyperbelverfahren

Kombinatorische Gruppentheorie

Zentraleinheit

Physikalische Theorie

Computeranimation

Data Mining

Eins

Multiplikation

Negative Zahl

Knotenmenge

Vorzeichen <Mathematik>

Trennschärfe <Statistik>

Gruppe <Mathematik>

Lineare Regression

Spirale

Datentyp

Punkt

Datenstruktur

Neuronales Netz

Bildgebendes Verfahren

Gerade

Funktion <Mathematik>

Einfach zusammenhängender Raum

Managementinformationssystem

Addition

Datennetz

Kategorie <Mathematik>

Inverse

Ein-Ausgabe

Dreieck

Linearisierung

Portscanner

Rechter Winkel

Wort <Informatik>

Ordnung <Mathematik>

Software Engineering

Neuronales Netz

10:50

Matrizenrechnung

Nichtlinearer Operator

Lineares Funktional

Datennetz

Zehn

Fluid

Browser

Computeranimation

Portscanner

Multiplikation

Tensor

Prozess <Informatik>

Datentyp

Punkt

Neuronales Netz

Neuronales Netz

11:56

Virtuelle Maschine

Knotenmenge

Prognoseverfahren

Matrizenring

Punkt

Sichtenkonzept

Tensor

Zehn

Datentyp

Vektorraum

Ordnung <Mathematik>

Quick-Sort

Computeranimation

12:44

Einfach zusammenhängender Raum

Programmiersprache

Addition

Matrizenrechnung

Nichtlinearer Operator

Datennetz

Euklidischer Raum

Hausdorff-Dimension

Matrizenrechnung

Versionsverwaltung

Zahlenbereich

Vektorraum

Raum-Zeit

Quick-Sort

Teilbarkeit

Computeranimation

Knotenmenge

Multiplikation

Tensor

Datentyp

Dimension 2

Leistung <Physik>

13:56

Resultante

Matrizenrechnung

Nichtlinearer Operator

Addition

Gewicht <Mathematik>

Datennetz

Vektorraum

Extrempunkt

Ein-Ausgabe

Computeranimation

Multiplikation

Bildschirmmaske

Tensor

Wort <Informatik>

15:13

Sichtbarkeitsverfahren

Lineares Funktional

Datennetz

Extrempunkt

Quick-Sort

Computeranimation

Übergang

Metropolitan area network

Bildschirmmaske

Prognoseverfahren

Mereologie

Computerunterstützte Übersetzung

Funktion <Mathematik>

16:10

Softwaretest

Wellenpaket

Datennetz

Selbstrepräsentation

Extrempunkt

Ein-Ausgabe

Quick-Sort

Computeranimation

Motion Capturing

Informationsmodellierung

Prognoseverfahren

Rechter Winkel

Mereologie

Bildgebendes Verfahren

Hilfesystem

Software Engineering

Funktion <Mathematik>

17:21

Softwaretest

Sichtbarkeitsverfahren

Resultante

Lineares Funktional

Subtraktion

Einfügungsdämpfung

Datennetz

Computeranimation

Eins

Richtung

Computerunterstützte Übersetzung

Figurierte Zahl

Funktion <Mathematik>

18:13

Resultante

Bit

Einfügungsdämpfung

Subtraktion

Gewicht <Mathematik>

Wellenpaket

Minimierung

Zahlenbereich

Term

Physikalische Theorie

Computeranimation

Richtung

Knotenmenge

Gradientenverfahren

Algorithmische Lerntheorie

Hilfesystem

Bildgebendes Verfahren

Lineares Funktional

Datennetz

Relativitätstheorie

Bitrate

Entropie

Fehlermeldung

Standardabweichung

Neuronales Netz

20:45

Bit

Datennetz

Algorithmische Lerntheorie

Computeranimation

Neuronales Netz

Übergang

21:45

Inklusion <Mathematik>

Matrizenrechnung

Nichtlinearer Operator

Subtraktion

Datennetz

Datenmodell

Vektorraum

Quick-Sort

Computeranimation

Multiplikation

Informationsmodellierung

Tensor

Reelle Zahl

Ablöseblase

Bildgebendes Verfahren

22:55

Resultante

Matrizenrechnung

Nichtlinearer Operator

Wellenpaket

Datennetz

Mathematisches Modell

Matrizenrechnung

Einfache Genauigkeit

Supercomputer

Computeranimation

Graphikprozessor

Multiplikation

Prognoseverfahren

Tensor

Rechter Winkel

Supercomputer

Ordnung <Mathematik>

Zentraleinheit

Bildgebendes Verfahren

24:19

Transinformation

Wellenpaket

Division

Zahlenbereich

Kartesische Koordinaten

Elektronische Publikation

Google Street View

Computeranimation

Großrechner

Metropolitan area network

Virtuelle Maschine

Deskriptive Statistik

Informationsmodellierung

Supercomputer

Digitale Photographie

Zeitrichtung

Projektive Ebene

Verzeichnisdienst

Bildgebendes Verfahren

25:29

Offene Menge

Konstruktor <Informatik>

Wellenpaket

Datennetz

Open Source

Machsches Prinzip

Zahlenbereich

Datenfluss

Computeranimation

Warteschlange

Spezialrechner

Graphikprozessor

Virtuelle Maschine

Energiedichte

Wellenpaket

Datennetz

Datentyp

Programmbibliothek

Vorlesung/Konferenz

Projektive Ebene

Rippen <Informatik>

Programmierumgebung

Zentraleinheit

26:31

Wellenpaket

Gewicht <Mathematik>

Mathematisierung

Mathematisches Modell

Computeranimation

Graph

Knotenmenge

Variable

TUNIS <Programm>

Konstante

Datentyp

Speicherabzug

Gruppoid

Datenstruktur

Einfach zusammenhängender Raum

Nichtlinearer Operator

Datennetz

Zehn

Graph

Freier Parameter

Programmierumgebung

Ein-Ausgabe

Datenfluss

Sinusfunktion

Konstante

Garbentheorie

Tensor

Speicherabzug

Datenfluss

27:51

Subtraktion

Wellenpaket

Mathematisierung

Matrizenrechnung

Aggregatzustand

Extrempunkt

Division

Computeranimation

Metropolitan area network

Graph

Trigonometrische Funktion

Multiplikation

Tensor

Sigmoide Funktion

Konstante

Speicherabzug

Programmbibliothek

Multitasking

Gruppoid

Gravitationsgesetz

Gammafunktion

Schnittstelle

Addition

Nichtlinearer Operator

Adressierung

Matrizenring

Zehn

Logarithmus

Division

Stichprobe

Disjunktion <Logik>

Vorzeichen <Mathematik>

Mailing-Liste

Programmierumgebung

Datenfluss

Speicherbereichsnetzwerk

Ranking

Quick-Sort

Variable

Arithmetisch-logische Einheit

Warteschlange

Diskrete-Elemente-Methode

Flächeninhalt

Tensor

Mereologie

Reelle Zahl

p-Block

Aggregatzustand

28:39

Inklusion <Mathematik>

Dualitätstheorie

Wellenpaket

Zehn

Mathematisches Modell

Zahlenbereich

Extrempunkt

Datenfluss

Computeranimation

Kernel <Informatik>

Spezialrechner

Virtuelle Maschine

Metropolitan area network

Tensor

Notebook-Computer

Datentyp

Datenfluss

ART-Netz

Bildgebendes Verfahren

Gammafunktion

Demo <Programm>

30:20

Metropolitan area network

Spezialrechner

Shape <Informatik>

Spannweite <Stochastik>

Pixel

Einheit <Mathematik>

Große Vereinheitlichung

Bildgebendes Verfahren

Computeranimation

Array <Informatik>

Linearisierung

31:06

Bit

Shape <Informatik>

Wellenpaket

Zahlenbereich

Plot <Graphische Darstellung>

Ausnahmebehandlung

Vektorraum

Ein-Ausgabe

Computeranimation

Homepage

Metropolitan area network

Energiedichte

Rechter Winkel

Datennetz

URL

Kantenfärbung

Partikelsystem

Neuronales Netz

Bildgebendes Verfahren

Gerade

32:46

Einfach zusammenhängender Raum

Pixel

Gewicht <Mathematik>

Wellenpaket

Datennetz

Freier Parameter

Zehn

Einfache Genauigkeit

Zahlenbereich

Ein-Ausgabe

Datenfluss

Computeranimation

Eins

Metropolitan area network

Variable

Flächeninhalt

Wellenpaket

Bildgebendes Verfahren

Neuronales Netz

34:34

Matrizenrechnung

Einfügungsdämpfung

Wellenpaket

Gewicht <Mathematik>

Selbstrepräsentation

Stapelverarbeitung

Zahlenbereich

Gradient

Extrempunkt

Computeranimation

Richtung

Metropolitan area network

Informationsmodellierung

Variable

Multiplikation

Wellenpaket

Gradientenverfahren

Gravitationsgesetz

Ganze Funktion

Funktion <Mathematik>

Lineares Funktional

Freier Parameter

Mathematik

Globale Optimierung

Ein-Ausgabe

Datenfluss

Entropie

Neuronales Netz

36:00

Inklusion <Mathematik>

Lineares Funktional

Bit

Einfügungsdämpfung

Wellenpaket

Total <Mathematik>

Extrempunkt

Stapelverarbeitung

Computeranimation

Richtung

Teilmenge

Softwaretest

Gradientenverfahren

Große Vereinheitlichung

Fehlermeldung

36:58

Teilmenge

Metropolitan area network

Softwaretest

Wellenpaket

Stichprobenumfang

Güte der Anpassung

Stapelverarbeitung

Sondierung

Ganze Funktion

Große Vereinheitlichung

Computeranimation

Software Engineering

37:52

Nichtlinearer Operator

Sichtenkonzept

Wellenpaket

Zehn

Stapelverarbeitung

Computerunterstütztes Verfahren

Datenfluss

Computeranimation

Mapping <Computergraphik>

Metropolitan area network

Softwaretest

Whiteboard

Tensor

Mereologie

Wort <Informatik>

Neuronales Netz

38:47

Soundverarbeitung

Bit

Wellenpaket

Datennetz

Selbst organisierendes System

Stapelverarbeitung

Computeranimation

Metropolitan area network

Virtuelle Maschine

Mailing-Liste

Softwaretest

Whiteboard

Datennetz

Tensor

Mereologie

Speicherabzug

Datenfluss

Gravitationsgesetz

Chi-Quadrat-Verteilung

Bildgebendes Verfahren

Demo <Programm>

Logik höherer Stufe

39:58

Bit

Gewicht <Mathematik>

Gewichtete Summe

Extrempunkt

Selbstrepräsentation

Zahlenbereich

Raum-Zeit

Computeranimation

Kernel <Informatik>

Metropolitan area network

Tensor

Randomisierung

Multitasking

Bildgebendes Verfahren

Funktion <Mathematik>

Pixel

Datennetz

Computersicherheit

Gebäude <Mathematik>

Einfach zusammenhängender Raum

Ein-Ausgabe

Quick-Sort

Einfache Genauigkeit

Faltungsoperator

Mereologie

Ablöseblase

URL

43:11

Wellenpaket

Datennetz

Minimierung

Singularität <Mathematik>

Computeranimation

Gradient

Netzwerktopologie

Metropolitan area network

Informationsmodellierung

Regulärer Graph

Mereologie

ATM

Gradientenverfahren

Funktion <Mathematik>

44:12

Metropolitan area network

Datennetz

Fuzzy-Logik

Rechter Winkel

Iteration

Stapelverarbeitung

Ordnung <Mathematik>

Speicherbereichsnetzwerk

Computeranimation

Funktion <Mathematik>

45:06

Lineares Funktional

Einfügungsdämpfung

Wellenpaket

Fächer <Mathematik>

Selbstrepräsentation

Kartesische Koordinaten

Ungerichteter Graph

Zeiger <Informatik>

Elektronische Publikation

Datenfluss

Whiteboard

Computeranimation

Metropolitan area network

Whiteboard

Tensor

Große Vereinheitlichung

Funktion <Mathematik>

46:01

Lineares Funktional

Multifunktion

Einfügungsdämpfung

Wellenpaket

Inverse

Versionsverwaltung

Abgeschlossene Menge

Computeranimation

Metropolitan area network

Variable

GRASS <Programm>

Entropie

Neuronales Netz

46:53

Lineares Funktional

Einfügungsdämpfung

Gewicht <Mathematik>

Datennetz

Graph

Extrempunkt

Gebäude <Mathematik>

Selbstrepräsentation

Zoom

Ein-Ausgabe

Login

Code

Computeranimation

Objekt <Kategorie>

Metropolitan area network

Informationsmodellierung

Mereologie

Wort <Informatik>

Ordnung <Mathematik>

Ext-Funktor

Bildgebendes Verfahren

Funktion <Mathematik>

48:18

Metropolitan area network

Subtraktion

Wellenpaket

Zehn

Programmbibliothek

Datenfluss

Gerade

Computeranimation

Neuronales Netz

49:13

Subtraktion

Hardware

Datennetz

Mathematisches Modell

Matrizenrechnung

Datenmodell

Datenfluss

Computeranimation

Kreisbogen

Mathematisches Modell

Virtuelle Maschine

Informationsmodellierung

Multiplikation

Mereologie

ATM

Programmbibliothek

49:59

Server

Datenmodell

Indexberechnung

Zahlenbereich

Teilgraph

Computeranimation

Rechenschieber

Graphikprozessor

Graph

Virtuelle Maschine

Mathematisches Modell

Datensatz

Informationsmodellierung

Task

Bit

Code

Tensor

Login

ATM

Heegaard-Zerlegung

Vollständiger Graph

Datenfluss

Zentraleinheit

50:48

Webforum

Server

Subtraktion

Gewicht <Mathematik>

Wellenpaket

Versionsverwaltung

Zahlenbereich

Extrempunkt

Term

Synchronisierung

Computeranimation

Intel

Graph

Graphikprozessor

Mathematisches Modell

Metropolitan area network

Informationsmodellierung

Task

Bit

PCMCIA

Code

Datentyp

Stochastische Abhängigkeit

Chi-Quadrat-Verteilung

Nichtlinearer Operator

Transinformation

Cookie <Internet>

Datenmodell

Indexberechnung

Datenfluss

Variable

Speicherbereichsnetzwerk

Tensor

Parametersystem

ATM

Client

Server

Cloud Computing

Datenfluss

Personal Area Network

Zentraleinheit

51:47

Parametersystem

Wellenpaket

Mathematisches Modell

Iteration

Geräusch

Winkel

Extrempunkt

Synchronisierung

Computeranimation

Metropolitan area network

Informationsmodellierung

Dienst <Informatik>

Diskrete-Elemente-Methode

Heegaard-Zerlegung

Server

Client

Gravitationsgesetz

Chi-Quadrat-Verteilung

Gammafunktion

52:42

Softwaretest

Netzwerktopologie

Expertensystem

Weg <Topologie>

Wellenpaket

Twitter <Softwareplattform>

Datennetz

Algorithmische Lerntheorie

Ordnung <Mathematik>

Figurierte Zahl

Speicherbereichsnetzwerk

Ereignishorizont

Computeranimation

54:12

Simply connected space

Management information system

Homothety

Parameter system

Telecommunication

Wave packet

Data network

Moment problem

Stack <computer science>

Number range

Extreme point

Computer animation

Metropolitan area network

Virtual machine

Node set

Discrete element method

Server

Word <computer science>

Order <mathematics>

55:36

Management information system

Nonlinear operator

Linear regression

Hardware

Ten

Automated planning

Extreme point

Computer animation

Graph

OISC

Uniform structure

Unit <mathematics>

Bit

Tensor

Forcing

Digital photography

Tensor

Order <mathematics>

Data flow

56:29

Subtraction

Website

Moment problem

Mathematical model

Term

Mathematical logic

Computer animation

Transition

One

Metropolitan area network

Virtual machine

Multiplication

Algorithm

Data type

Visualization

Grand unification

Parameter system

Graph

Ten

Goodness of fit

Two

p-Block

Biproduct

Data flow

Quicksort

Arithmetic logic unit

Right angle

Mereology

Word <computer science>

Neural network

Standard deviation

### Metadata

#### Formal metadata

Title | Deep Learning with Python & TensorFlow |

Series title | EuroPython 2016 |

Part | 148 |

Number of parts | 169 |

Author | Lewis, Ian |

License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You may use, adapt, copy, distribute, and make the work or its content publicly available, in unchanged or adapted form, for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and pass on the work or content, including in adapted form, only under the terms of this license |

DOI | 10.5446/21151 |

Publisher | EuroPython |

Year of publication | 2016 |

Language | English |

#### Content metadata

Subject area | Computer science |

Abstract | Ian Lewis - Deep Learning with Python & TensorFlow. Python has lots of scientific, data analysis, and machine learning libraries. But there are many problems when starting out on a machine learning project. Which library do you use? How do they compare to each other? How can you use a model that has been trained in your production application? TensorFlow is a new open-source framework created at Google for building Deep Learning applications. TensorFlow allows you to construct easy-to-understand data flow graphs in Python, which form a mathematical and logical pipeline. Creating data flow graphs allows easier visualization of complicated algorithms as well as running the training operations over multiple hardware GPUs in parallel. In this talk I will discuss how you can use TensorFlow to create Deep Learning applications. I will discuss how it compares to other Python machine learning libraries like Theano or Chainer. Finally, I will discuss how trained TensorFlow models could be deployed into a production system using TensorFlow Serving. |