
# Machine learning and applications

#### Automated media analysis

## The TIB|AV-Portal uses these automatic video analyses:

**Scene detection** – **Shot Boundary Detection** segments the video based on visual features. A visual table of contents generated from this gives a quick overview of the video's content and offers precisely targeted access.

**Text recognition** – **Intelligent Character Recognition** captures, indexes, and makes written language (for example, text on slides) searchable.

**Speech recognition** – **Speech to Text** transcribes the spoken language in the video as a searchable transcript.

**Image recognition** – **Visual Concept Detection** indexes the moving image with subject-specific and interdisciplinary visual concepts (for example, landscape, facade detail, technical drawing, computer animation, or lecture).

**Keyword tagging** – **Named Entity Recognition** describes the individual video segments with semantically linked subject terms. Synonyms or narrower terms of entered search terms can thereby be included in the search automatically, which broadens the result set.

Recognized entities

Speech transcript

00:00

So I will present our work on machine learning. I'm with the Berlin Big Data Center and also with TU Berlin, and I have an affiliation with Korea University.

00:15

Because I don't expect everyone to know what machine learning is in all its details, I have tried to explain it very simply on the one hand, and also to give something to the people who already know it. So what I'm saying is quite technical, but since you're a bunch of mathematicians you should be able to take it. So, the basics: what is machine learning about? We would like to learn from data. That means we have data — some points x, and the data lives in very high-dimensional spaces — and it has some labels y, which could be continuous, or discrete as in this classification problem where we want to distinguish between red and green. The only thing people in machine learning do is try to infer some unknown mapping between X and Y, assuming there is some joint probability distribution that only God knows. The one important point here is that all the mathematics behind this tries to make sure that you can generalize, meaning that you actually get this mapping right for unseen data — which is a very strange concept if you think about it, because from given data you have to infer something you haven't seen. So this is the function f that you try to estimate. It could be quite complex, it could also be another function; there are infinitely many functions, and the question is which one is the optimal — the optimally generalizing — one. If I show you this data point, most of you would attribute a red label to it, because your brain, by eyeballing, does some kind of inference and assigns the red color to this case. But of course if you have million-dimensional data you cannot eyeball things, so you need a proper mathematical framework to put this in place, and that is what the field does.
Apart from this nice theory, we actually have a bunch of methods that work very well. One of them was already mentioned: neural networks. Another one is kernel methods, like support vector machines, which a lot of people are using. I will first tell you the idea behind these, and then we will continue and see how they can be used.
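The learning setup the speaker describes — data points with labels, an unknown mapping inferred from samples, judged only on unseen data — can be sketched in a few lines. This is an illustrative toy example, not the speaker's code; the data and the nearest-centroid learner are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two Gaussian point clouds in R^2: class "red" (y=0) and class "green" (y=1).
X = np.vstack([rng.normal(-1.0, 0.6, (100, 2)), rng.normal(+1.0, 0.6, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Hold out unseen data: generalization means getting *these* points right.
idx = rng.permutation(200)
train, test = idx[:150], idx[150:]

# The simplest possible learner: a nearest-centroid classifier f: X -> Y.
mu0 = X[train][y[train] == 0].mean(axis=0)
mu1 = X[train][y[train] == 1].mean(axis=0)

def f(x):
    """Predict the class whose training centroid is closer."""
    return int(np.linalg.norm(x - mu1) < np.linalg.norm(x - mu0))

test_acc = np.mean([f(x) == t for x, t in zip(X[test], y[test])])
print(test_acc)
```

The held-out accuracy is the empirical stand-in for the generalization the talk emphasizes: the rule was fit on one sample and evaluated on points it never saw.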

03:34

the kernel method basically is this: we have some function

03:40

where w are the parameters, phi is some feature mapping, and x is your data. So the idea of a support vector machine is the following. You have some data — again red and green — and we map this data to some high-dimensional space with the mapping phi. In this high-dimensional space we do something very simple, for which we can prove guarantees, and by virtue of doing the optimal thing in this high-dimensional space, we do the optimal nonlinear thing in the original space. Think about it: say you map your data with polynomials, and you have a picture of a thousand by a thousand pixels, which is about a million pixels. If you map this with a polynomial of 10th order, then you have approximately a million to the tenth dimensions in that space, which is quite large. You do something very simple there, for which you can prove guarantees, and in the original space you get a function that classifies images very well. I was among the first people who actually worked on this, together with my mentor Vapnik and my students Bernhard and Alex. At that moment we were told otherwise by everyone: the field called pattern recognition and the field called statistics always told us that we should not use many features, that we should use as few as possible — and we were doing exactly the opposite. The reason we can do this is the very nice theory that allows us to show that if we are doing the optimal thing here, then we are also doing the optimal nonlinear thing there. That is, in a nutshell, the idea. Practically, we
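The trick behind "do the optimal thing in a huge space without ever building it" can be made concrete: for a polynomial kernel, the inner product in the high-dimensional feature space equals a simple kernel evaluation on the original vectors. A minimal numpy check (illustrative, not from the talk):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
z = np.array([0.5, -1.0, 2.0])

def phi(v):
    """Explicit degree-2 feature map for R^3: all products v_i * v_j."""
    return np.outer(v, v).ravel()          # 9-dimensional feature vector

explicit = phi(x) @ phi(z)                  # inner product in feature space
kernel   = (x @ z) ** 2                     # polynomial kernel k(x,z) = (x.z)^2

print(explicit, kernel)                     # the two values coincide

# For a 10th-order polynomial on a million pixels the explicit map would be
# astronomically large, but k(x,z) = (x @ z)**10 still costs one dot product.
```

This identity — phi(x)·phi(z) = (x·z)² for this feature map — is what lets the support vector machine operate implicitly in the huge space.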

06:10

solve an optimization problem, which is a quadratic program; we go to the dual, and this gives us the solution for the support vector machine. For those of you who haven't seen quadratic programs, ignore it; for those of you who have, this is a nice quadratic program. So, the other learning machine
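The quadratic program alluded to here is, in its standard soft-margin dual form (standard textbook notation, not copied from the slides):

```latex
\max_{\alpha \in \mathbb{R}^n} \;
  \sum_{i=1}^{n} \alpha_i
  \;-\; \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j \, y_i y_j \, k(x_i, x_j)
\quad \text{subject to} \quad
  0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0 ,
```

with the decision function \( f(x) = \operatorname{sign}\!\big( \sum_i \alpha_i y_i \, k(x_i, x) + b \big) \). Only the support vectors end up with \( \alpha_i > 0 \), and the data enters exclusively through the kernel \( k \).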

06:43

that is very popular these days is a neural

06:46

network. Now, neural networks have been around for quite a while — in fact, they have been dominating the brains of our species for

06:59

a couple of million years, but in science — in computer science — neural networks came up sometime in the sixties of the last century, and basically they take the following structure. Back in the sixties they were called perceptrons. A perceptron was just a bunch of inputs connected with some weights; the weighted activities of the inputs were summed up and went through a nonlinearity, and that was it. Now, since perceptrons are not very capable of doing any really interesting nonlinear stuff, people thought that stacking a bunch of perceptrons next to each other and on top of each other would help. Basically, neural networks are a universal function approximator: you take some input, sum it up with some weights, put it through a nonlinearity, and you do this again and again, and then you can make bumps. Just for the mathematicians: this is a universal function approximator, and I can show you with a sheet of paper. What does it do? Every component, every neuron, is basically a nonlinear function — a sigmoid — in a high-dimensional space, and with the parameters you can make it flatter or steeper. Now you take another sigmoid and combine it with the first one, so in the next layer you have a ridge; then you take another ridge, and you have a bump — and with bumps you can do the universal approximation proof with a sheet of paper. So these neural networks are quite useful, and the question is why they are becoming such a hot method again. The reason is that nowadays we have GPUs, so we have much more computing power, and we also have a lot of data — big data. Of course, there is nothing informative about big data as such; the data is just
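The sheet-of-paper argument — sigmoid, then ridge, then bump — can be reproduced numerically: subtracting two shifted, steep sigmoids yields a localized bump, the building block of the universal-approximation construction. A toy sketch, not the talk's code:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def bump(x, left, right, steepness=50.0):
    """Difference of two steep sigmoids: ~1 on [left, right], ~0 elsewhere."""
    return sigmoid(steepness * (x - left)) - sigmoid(steepness * (x - right))

xs = np.linspace(-2, 2, 401)
b = bump(xs, -0.5, 0.5)

inside  = b[np.abs(xs) < 0.3].min()   # well inside the bump: close to 1
outside = b[np.abs(xs) > 1.0].max()   # far outside the bump: close to 0
print(inside, outside)

# Summing many such bumps with chosen heights approximates any continuous
# function on an interval -- the core of the universal approximation proof.
```

Increasing `steepness` sharpens the bump's edges; superpositions of such bumps are exactly what a two-layer sigmoid network can represent.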
data. It becomes information when you actually ask good, interesting questions of it — when you build an interesting model. The interesting point about neural networks is that they are very efficient in the sense that they can deal with a lot of data. What I mean by this: if you want to solve a quadratic program with a million parameters, you need approximately between a million squared and a million cubed of computing time in terms of the number of variables, whereas a neural network scales linearly — that's a good thing, and this is why they are used so much. Now, putting on my statistician's hat: if you think about estimators in general, there is something good about a lot of data, because if you want to estimate something, the more data you get, the closer you get to the solution. If you have a decent estimator, the error goes like 1/n, where n is the number of data points, until you reach the noise level. This means that having a lot of data is good — and most of the techniques in machine learning outside neural networks are not able to make use of a lot of data, which is why neural networks are now very popular. One more thing about kernel methods: kernel methods use this mapping to feature space, and you have to say what the feature space is. It is some Hilbert space, a representation that tells you how to compare things nonlinearly — you have to specify it beforehand; that's the important point. The neural network, by contrast, learns this representation by itself from data, which also means that multi-scale information can be learned. So, machine learning has become a huge economic sector. It is being used in self-driving cars; all the best pattern

12:18

recognition methods and all the best speech recognition methods these days work with machine learning, with neural networks. We find Higgs particles with it, we do neuroscience with it, you use it in your cell phones, when you use Google, Facebook, Amazon, what not — and we can also do some decent science with it. So it is used everywhere. But what I

12:51

am much more interested in is using machine learning as a kind of enabling technology for the sciences. Just as mathematical modelling is an enabling technology for the sciences, machine learning is an enabling technology for the sciences — it just comes from a different corner: in mathematical modelling you make up your mathematical, kinetic model; here you are data-driven. Of course, these things are not as far apart as they seem; they are actually growing together, and any reasonable person would think this is a good idea. So first I will tell you a bit about my neuroscience hobby, which I have been pursuing since about 2000, and this is the Berlin Brain-

13:47

Computer Interface project. There are three principal people behind it: Benjamin Blankertz, who is a professor at TU Berlin, Gabriel Curio, who is a medical doctor at the Charité, and myself. We came to the conclusion that it would be interesting, given brain signals, to understand what cognitive state the brain is in — in other words, something like mind-reading, although of course this is very simple-minded mind-reading. But let me give you the motivation why people are interested in this. From the medical side, there are the so-called locked-in patients: they have no means of communication, but they have intact brains. So if you could read their brain signals, decode them, and translate them into a control signal for, say, a wheelchair or some writing device, they could communicate with the outside world. This was one of the motivations people had. For me, because I am a machine learner, this looks like a machine learning problem: you have, say, 60 channels of EEG at a thousand hertz each, you extract some features, and you classify. What I will be showing you is a translation of human intentions into a control signal without using muscle activity or any kind of peripheral activity — because of course if I start moving my eyes or making faces, there is muscle activity that will reflect itself in the brain signal as well, but this is not available to ALS patients, because they cannot move. So if we want to help this patient group in the long run, we need to be able to communicate with them without muscle activity, practically. And this is a complicated thing: you have this multivariate EEG, and in real time you have to do all sorts of filtering, you have to remove artifacts, and you have a lot of feature extractors that put in all the medical and neuroscience knowledge we have about the brain; then you stack this into a huge vector,
and then we put this through our favorite learning machine — this could be a support vector machine, this could be some linear discriminant analysis, it could be a neural network, what not. So when

17:12

I got interested in this, the state of the art was that the subjects had to be trained for about 100 to 300 hours, which means people had to wear an EEG cap like that and then be trained with biofeedback in order to get out some decent signal. So we said: well, let's do it the other way round — let the subjects think whatever they think, and let the machines learn whatever needs to be learned. This reduced the necessary training time to about 5 minutes. Before, there were maybe a dozen groups in the world that worked on this; now there are more than 400, maybe even 1000, groups in science and industry doing all sorts of things with brain-computer interfaces. So what are the applications? One application, of course, is to help patients — this could be the ALS patients, or patients with stroke, where we would enhance rehabilitation. But just to give you a better idea of what can be done: this is an old video, but it is still a good illustration. You have the subject here wearing the cap; the data goes into the amplifier, from the amplifier it goes to this laptop, and the laptop gives a control signal to the screen and moves the cursor around. As a subject, you have to assume different brain states, and once the decoding is done right, the subject can move the cursor and play a game — the game of brain pong. The subject is sitting still, not doing anything, not using the eyes, and is controlling this paddle purely through brain signals, decoded in real time. Now you might ask the subject: what did you think about in order to get this done? First of all, let me teach a bit of physiology. If I wave my right hand like this, then on my left hemisphere, over the motor cortex, there is some activation — in fact, there is an idle rhythm that is suppressed. If I wave my left hand, then the
idle rhythm is suppressed on the right side. Now, the interesting thing — and this is not something we invented, this is known physiology — is that if we only imagine the movement, without any waving, we still have the same effect: the suppression of the idle rhythm over the contralateral motor cortex. This we can use for decoding purposes, and this is what is being used here. So you see that in principle you can get one bit out, but you can get many more bits out: nowadays, depending on the speller paradigm, you can get enough bits to communicate whole words. We had a demo on this at CeBIT, which already did

21:25

7 to 8 letters a minute. So that's the story so far — this helps patients. But I think the really interesting point is that we can use this device to understand what the brain is actually doing, because the brain is a machine that is thinking and behaving in real time, and now we have a device that can decode these activities, so we can do interesting studies that help us understand how the brain works. There is also something quite interesting: we can use this in technology. You have already seen someone play a game, and you can imagine all sorts of interaction with this new channel provided by the brain that you can decode in real time. The most obvious one I would like to tell you about — and unfortunately Professor Kleiner is not here anymore, so I cannot complain to him; this is more of a joke. Some years ago, when this Excellence Initiative thing happened, the universities thought about ambitious projects — as part of some excellence center, some moonshot. I have a colleague in Berlin, Thomas Wiegand. Wiegand is one of the fathers of H.264, which is the current video coding standard; every second bit on the Internet is coded with it, so you are using it all the time without knowing. So I suggested to him: remember MP3, where people studied very carefully what the ear can perceive — why shouldn't we study, with a brain-computer interface, what we can perceive in video, and just code that? That was the idea, and we wrote this up, and sure enough the reviewers said it was nonsense and did not like it. But we did it anyway, and by now there is a line of research, a number of papers, and so on. And the point is not to use a brain-computer interface to change something in your video codec; the point is to learn something about perception. And we did learn something about perception,
and we could use this later — in particular, people have used this to the extent that they could improve video coding. For example, there was a Champions League final broadcast by Sky, I think one and a half years ago or so, where this new coding was being used. And the interesting part is: you might say this was maybe a couple of percent improvement, but if you think about a couple of percent on a global scale — because every second bit on the internet is broadcast with H.264 and its successor standards — it amounts to a couple of nuclear power plants in terms of energy consumption. So the fact that you can actually understand something about the brain, translate this into better coding, and thereby make the planet a bit greener — this may be a bit non-obvious, and it was clearly beyond the scope of those referees. The conclusion from this is that it doesn't matter what people say in reviews; peer review just doesn't always work. So, another part

25:49

that I would like to show you is maybe equally funny but even more heretical — so please prepare for some heresy. This was

26:03

in 2011. There is a very nice place on this planet called IPAM, the Institute for Pure and Applied Mathematics, at UCLA, and I was invited to participate in a program there for 3 months. My sabbatical was coming up anyway, and there is nothing wrong with going to California, so you can't do anything wrong by going. Then I realized that this was a program about quantum chemistry — so not really machine learning. But I happen to have a history as a theoretical physicist who got some training in quantum mechanics and did some strings and things like that in my past. So I heard people talk about the Schrödinger equation. The Schrödinger equation is a wonderful equation, which you can see here, and it is a very complicated equation that cannot be solved other than in approximation — people got Nobel prizes for these approximations; one of them is called density functional theory. In fact, when I first heard 'DFT' I thought: what are they talking about, this is not the discrete Fourier transform — so much for the joint language that you have to develop. When I sat in these talks, I thought: this is very nice, people come from first principles, they do everything from first principles, and at some point they make an approximation. So I thought: why not treat the Schrödinger equation as a black box and pose this as a prediction problem? This is as if somebody took, say, Navier-Stokes and said: everything that goes in matters, everything that comes out are results, and in between we just predict with a neural network or what not. This is what I suggested to the people, and of course they were not amused — they were just not quite clear on how they would kill me, so I survived. The interesting point was that there were some young people who got infected by the idea — skeptical, but
curious enough to say: let's try this crazy idea. So what they did was generate a lot of data with DFT. A DFT calculation for a small molecule, in a reasonable approximation, costs hours of computing time per molecule — next to the climate people, these are the guys that use all your compute. So they generated something like 7000 molecules with all their properties. In fact, it was not all training data: we took part of it, a thousand molecules, as training data, built a prediction model from it, and then on the rest of the data, which we had not seen before, we tested whether or not we could predict the outcome of the Schrödinger equation. This was 2011, and in 2012 our first paper on this appeared in Physical Review Letters. The way we did this was

30:07

the following. First of all, in machine learning you need an input, and here the input would be a molecule. So this is now a molecule: you take the coordinates and the nuclear charges of all the atoms, and then you say, well, I have to represent the similarity of two molecules somehow. A good way of representing a molecule is to write down something which we called the Coulomb matrix: we take the Coulomb interactions between the i-th and the j-th atom and write them into a matrix. So the matrix element M_ij between the i-th and the j-th atom is just the nuclear charges over the distance — the Coulomb law. This is a matrix representation — maybe people have told you that in machine learning we can only deal with vectors, but we can deal with vectors, with matrices, with tensors, with graphs, with whatever you have. So this is now a matrix representation, and if we compare two molecules M and M' as represented here, we just take the difference between them, and then we do something very simple — I insisted that we do something simple:
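The Coulomb matrix the speaker describes has a standard closed form: off-diagonal entries Z_i Z_j / |R_i - R_j| and diagonal entries 0.5 Z_i^2.4. A minimal numpy version (the water geometry below is approximate and only for illustration):

```python
import numpy as np

def coulomb_matrix(Z, R):
    """Coulomb matrix: M_ij = Z_i*Z_j / |R_i - R_j|, M_ii = 0.5 * Z_i**2.4."""
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    d = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    with np.errstate(divide="ignore"):
        M = np.outer(Z, Z) / d              # diagonal becomes inf for a moment
    np.fill_diagonal(M, 0.5 * Z ** 2.4)     # then is overwritten by the rule
    return M

# Approximate water geometry in bohr: O at the origin, two H atoms.
Z = [8, 1, 1]
R = [[0.0, 0.0, 0.0], [1.43, 1.11, 0.0], [-1.43, 1.11, 0.0]]
M = coulomb_matrix(Z, R)
print(M)
```

Comparing two molecules then reduces to comparing two such matrices, for example via the Frobenius norm of their difference, which is exactly what feeds the kernel in the next step.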

31:42

we first put this into a Gaussian — so this is a kernel model — and then we do kernel ridge regression, which is about the simplest nonlinear model you can think of, but a very powerful one. It can be solved in closed form, so the training time is negligible and the prediction time is negligible. So what did we
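Gaussian kernel plus kernel ridge regression has the closed-form solution alpha = (K + lambda*I)^(-1) y. A self-contained sketch on a toy one-dimensional function (the data and hyperparameters are invented; a real run would use molecule distances in place of |x - x'|):

```python
import numpy as np

rng = np.random.default_rng(2)

def gauss_kernel(A, B, sigma=0.5):
    """Gaussian kernel matrix K_ij = exp(-(a_i - b_j)^2 / (2 sigma^2))."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

# Toy data: a noisy sine, standing in for molecule -> energy pairs.
X = np.sort(rng.uniform(-3, 3, 60))
y = np.sin(X) + 0.05 * rng.normal(size=60)

lam = 1e-3
K = gauss_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(60), y)   # closed-form "training"

X_new = np.array([0.0, 1.0, -2.0])                 # unseen inputs
y_hat = gauss_kernel(X_new, X) @ alpha             # near-instant prediction

print(np.max(np.abs(y_hat - np.sin(X_new))))       # small out-of-sample error
```

One linear solve is the entire fit, which is why training and prediction times are negligible once the kernel matrix is built.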

32:10

get? We were at about 10 kcal/mol accuracy, which is not good enough, but not bad: if we take the mean across all molecules as a predictor, the error would be around 350 kcal/mol, and a semi-empirical quantum-chemical approximation gives a decent figure too. A bit later, using neural nets, we were down at 3 kcal/mol, and by 2015 we already had about 1 kcal/mol, which is chemical accuracy — and this is out of sample. We are just taking the molecular representations and predicting, and of course instead of energies we can also predict polarizabilities and everything else, within less than a millisecond. So, before I

33:25

come back to this cycle — I will come back to this some more — let me give you some perspective on machine learning models. So, if you have a machine learning model,

33:46

it is a highly nonlinear beast, for example a deep neural network with all these layers — very complex, very nonlinear. You put something in, and the neural network answers: it classifies this input image into the respective class. This is a classification problem where you have a couple of thousand classes — roosters, fossils, cars, what not — and the neural network actually gives you the correct prediction, and you wonder: why does it give you this prediction? There is a field in machine learning called feature selection. In feature selection, you have your huge bulk of data and you ask which of the inputs are the most salient ones, and people use the lasso and related tools in order to find a few variables that are responsible. Now, the interesting thing is: over the whole bulk of data it is nice to know what is interesting in general, but if we think about medical diagnosis, say, you couldn't care less about the diagnoses of the whole ensemble — you would like to know your individual diagnosis, you would like to know which individual variables the model considers important for this particular decision. And the reason why people in the sciences have generally only been working with linear models is that in a linear model there is a way back — an obvious way back: if I have a linear classifier in some space and I put a data point here, then I know that this direction is the one that matters. But if I have this very highly

36:06

nonlinear classifier, whatever it is, how do I get back which inputs were the relevant ones? This was an unsolved problem that is solved now, for any kind of nonlinear machine learning model — for neural networks, for kernel machines, and so on. This was done by Sebastian

36:25

Bach, Wojciech Samek, and colleagues — I am also among the authors. The idea is the following: we take the classification and go backwards, from the result of the classification to the input, and make a heat map which tells you which of the pixels, in this case, are the most salient ones for this particular decision. And you see it is this thing here — most of the head part of the rooster — that is considered the most salient for this particular case. So I will just

37:12

give you an idea about this. Mathematically it is very beautiful, because you have this very highly complex nonlinear thing, and you could do some kind of Taylor expansion around a point — but the plain Taylor expansion is not so helpful, because it would be a global thing and you would have to keep many orders. We could show that if you take a Taylor expansion that is local — local to the firing neurons — then it is easy to do, and with locally linear Taylor expansions you can compose the global nonlinear explanation. It is called deep Taylor decomposition — this is a very popular paper — and in this manner you can show some mathematical properties and really understand the problem. So let me just give you an idea. You have the picture that goes in here, like this ladybird, and then you have your neural network that classifies this as

38:20

— sorry, not ladybird but ladybug — so this is considered a ladybug by the network. And now what we are doing is going backwards, and we call this relevance propagation. This output node was the most relevant, and we propagate this relevance backwards: basically, you take all the activations of the network — you already have the weights, because the network has been trained — and if you want to know the relevance of a neuron here, you sum up the relevances from above, appropriately normalized with the activations that you got from the forward pass. And you can give this a theoretical interpretation in terms of the Taylor
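The backward rule sketched here — redistribute each neuron's relevance to its inputs in proportion to their forward-pass contributions a_i * w_ij — can be written down for one tiny ReLU layer. This is the basic LRP-0 rule as a toy numpy sketch, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(3)

# One dense ReLU layer: 4 inputs -> 3 neurons, whose outputs are summed.
W = rng.normal(size=(4, 3))
x = rng.uniform(0.1, 1.0, size=4)        # input activations
z = np.maximum(x @ W, 0.0)               # forward pass through ReLU
out = z.sum()                            # scalar "score" to be explained

# LRP-0: each neuron's relevance is split over inputs proportionally to a_i*w_ij.
R_top = z.copy()                          # each active neuron carries its share
contrib = x[:, None] * W                  # contributions a_i * w_ij
denom = contrib.sum(axis=0)               # sum_i a_i * w_ij (pre-activation)
denom[z == 0.0] = np.inf                  # dead neurons pass no relevance
R_input = (contrib / denom * R_top).sum(axis=1)

print(R_input.sum(), out)  # conservation: input relevance sums to the output
```

The key property checked here is relevance conservation, which the talk mentions next: what flows out of the output layer is exactly what arrives at the input pixels.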

39:09

expansion and so on. You can also require that relevance does not increase or decrease on the way back — there is a relevance conservation property that you need here. And then you see: OK, this is the part — mostly the body. So here are

39:33

pictures of spiders and cats and so on, and you see that cats come in front of different backgrounds, with different shapes and breeds and everything. So the feature selection approach would be completely nonsensical here, because it would say 'a cat is in the middle of the picture'; it would not tell you what the cat-like thing is in this particular image. Now, this is also an

40:05

interesting one. People in machine learning — I just told you — are obsessed with the generalization error: they want to minimize the error on unseen data. So assume you take two models. In this case, this one is a deep neural network — not even trained by us — and this one is a Fisher vector model, which is a very popular computer vision model. We put this image in, and we get heat maps out. Now, if we look at the out-of-sample performance of both models, it is the same: both generalize equally well. But they seem to solve the problem differently: whereas this beast here looks at the horses, this one looks at the lower left corner. And if you look at the

41:15

lower left corner and read it, then there is a tag there which identifies the source of the photograph. This is a public database of 20 million pictures; nobody has ever looked at all of them, but it is used as a benchmark for measuring the generalization error. So both of these models do a great job, but they solve the problem differently. If we are using machine learning models in the sciences or in engineering, we had better know why they behave the way they behave and whether they are able to do what we want. This is extremely important, because if we think about physics or chemistry or engineering, we cannot afford to have this kind of cheating. Although, in a sense, looking at tags is intelligent behaviour — it just happened that the horse class had this tag and nobody noticed — it is not exactly what we have in mind for a model. So

42:34

understanding is a key concept that we need if we are using these kinds of models and if we are engaging in modelling. Now, coming back to the physics part: there is a recent paper of ours that just appeared — and I see I am taking a few minutes of your coffee break, so I will be quick. So these are the people

43:07

who worked on this; the paper appeared on the 9th of January in Nature Communications. We take a deep tensor neural network and learn an atomistic representation — the same setting, the Schrödinger equation, again. So basically, again,

43:29

the molecules are transformed into some features, and now this is a bit more complex.

43:38

We are now trying to use a neural network. Before, we had this kernel that basically compared molecules by using differences between Coulomb matrices; now we would like to have something like word2vec, the representation used in language analysis, where you embed language context — something symbolic — into a vector. In a sense, we would like to understand the local atomic properties within the molecule, and ask how they can be represented as a vector or something like it — and we learn this. So this is now a deep neural network where for every atom we have this kind of representation — you see these equations — and for every atom we get an energy contribution, and then we sum it all up; of course we can also have other contributions, like polarization. To build this model, we tried to think about the underlying chemistry and the local interaction profiles that molecules have, and to implement our model accordingly. So there is this interesting feedback loop, which means that this innocently looking network is a network where you first model the local context of one atom, then in the next step you feed this into the local context of two-atom correlations and interactions, and then into three-atom interactions and so on, and you share some of the weights.
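The core idea of the architecture — per-atom embeddings refined by repeated interaction passes with shared weights, then atom-wise energies summed into a molecular energy — can be caricatured in a few lines. This is a drastically simplified sketch with invented dimensions and random weights, not the paper's deep tensor neural network:

```python
import numpy as np

rng = np.random.default_rng(4)

n_atoms, dim = 5, 16
Z = rng.integers(1, 9, n_atoms)                  # nuclear charges (H..O)
R = rng.normal(size=(n_atoms, 3))                # random 3-D positions

embed = rng.normal(size=(10, dim)) * 0.1         # one vector per element type
W = rng.normal(size=(dim, dim)) * 0.1            # shared interaction weights
w_out = rng.normal(size=dim) * 0.1               # atom-wise energy readout

c = embed[Z]                                     # initial per-atom states
d = np.linalg.norm(R[:, None] - R[None, :], axis=-1)
gate = np.exp(-d ** 2)                           # distance-based coupling
np.fill_diagonal(gate, 0.0)                      # no self-interaction

for _ in range(3):                               # refinement passes
    messages = gate @ np.tanh(c @ W)             # aggregate neighbor terms
    c = c + messages                             # update each atom's state

E = float((c @ w_out).sum())                     # sum of per-atom energies
print(E)
```

In the real model the embeddings, interaction tensors, and readout are trained so that E matches reference quantum-chemistry energies; the sketch only shows the data flow.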

45:36

So this is the network if you roll it out, and there are some details here. Now if you do this and you go

45:45

across chemical compound space, meaning you take data on some molecules, train the model, and then evaluate it on other molecules that it has not seen before, then you get below 1 kcal/mol. If you take just the atomic neighborhoods, it is not enough; if you add pairwise interactions, interactions between pairs along the graph, it becomes better. You can play the same game for
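The evaluation protocol described here (train on some compounds, test on unseen ones, and compare descriptors of different richness) can be sketched with kernel ridge regression on synthetic stand-in data. Everything in this snippet is invented for illustration: the "molecules" are random point sets, the target is a toy pairwise interaction energy, and the descriptors and hyperparameters are hypothetical; typically the richer pairwise descriptor generalizes better, mirroring the observation in the talk.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_molecule(n_atoms):
    """Toy 'molecule': random 3D point set with a pairwise interaction energy."""
    R = 2.0 * rng.normal(size=(n_atoms, 3))
    d = np.linalg.norm(R[:, None] - R[None, :], axis=-1)
    y = float((1.0 / d[np.triu_indices(n_atoms, 1)]).sum())
    return R, y

def descriptor(R, pairwise=True):
    """Atom count only, or sorted inverse pairwise distances (fixed length)."""
    if not pairwise:
        return np.array([float(len(R))])          # neighbourhood count alone
    d = np.linalg.norm(R[:, None] - R[None, :], axis=-1)
    v = np.sort(1.0 / d[np.triu_indices(len(R), 1)])[::-1]
    out = np.zeros(10)
    out[:min(10, len(v))] = v[:10]
    return out

def krr_rmse(X, y, n_train=100, gamma=0.5, ridge=1e-3):
    """Train kernel ridge regression on the first molecules, test on the rest."""
    Xtr, Xte, ytr, yte = X[:n_train], X[n_train:], y[:n_train], y[n_train:]
    sq = lambda A, B: np.sum((A[:, None] - B[None]) ** 2, axis=-1)
    K = np.exp(-gamma * sq(Xtr, Xtr))              # Gaussian kernel matrix
    alpha = np.linalg.solve(K + ridge * np.eye(n_train), ytr)
    pred = np.exp(-gamma * sq(Xte, Xtr)) @ alpha   # prediction on unseen molecules
    return float(np.sqrt(np.mean((pred - yte) ** 2)))

mols = [make_molecule(int(rng.integers(3, 6))) for _ in range(150)]
y = np.array([e for _, e in mols])
X_pair = np.stack([descriptor(R, True) for R, _ in mols])
X_count = np.stack([descriptor(R, False) for R, _ in mols])

print("counts only :", krr_rmse(X_count, y))
print("pairwise    :", krr_rmse(X_pair, y))
```

The point is the protocol, not the numbers: the test molecules never appear in training, so the error measures generalization across the (toy) compound space.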

46:19

molecular dynamics, that is, quantum-mechanically accurate molecular dynamics. There, of course, you need much better accuracy, so we are way below 0.1 kcal/mol, and we can simulate molecules like this one here and come very, very close to the true quantum-mechanical molecular dynamics simulation. Now, this is all

46:51

good. So this was the best available model: with the same architecture it could go across chemical compound space, and for one single molecule it could predict the molecular dynamics behaviour. Now you have this thing, and everybody thinks, oh, this neural network is a black box; but it is not, any more. We can start looking at what this thing has implemented, what it has learned about chemistry and physics. For example, we can look at a bunch of molecules and see where an H atom would bind, or where a C atom would bind, and so on. We can look at molecular properties like aromaticity that are not available in chemical databases in this form. Let me try to explain. In the chemical databases we find the energies and aromaticities and all this stuff for whole molecules, but we do not have the information, if we take all molecules that have a benzene ring with some attached groups, which of these many molecules has the most stable benzene ring, which one is most aromatic. This is not in the database, and yet we can infer it; that is a new tool that has learned some chemistry. Of course I am being very brief in this explanation, and I am not a physicist (apologies to everyone who has been contributing to this), so I am describing it the way I have understood it, from the computer science perspective. But I think this is a very good starting point. In fact, this is a field that is developing strongly: already in 2015 there were five workshops on this topic in the US alone, so it is a good and substantial growth that we are experiencing. And if we think about it, we now have techniques that are blazingly fast with which we can learn something.
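The idea of querying a trained model for chemistry that is not in any database (for instance, which of many substituted benzene rings is the most stable) can be illustrated abstractly: once a model predicts energies, ranking hypothetical candidates is a single loop. All names and the stand-in predictor below are invented for illustration; this is not the analysis pipeline from the paper.

```python
import numpy as np

def rank_candidates(predict, candidates):
    """Rank hypothetical structures by a trained model's predicted energy.

    predict: any trained model mapping a structure to a scalar energy
             (lower = more stable).
    candidates: dict mapping a candidate name to its structure.
    Returns candidate names, most stable first.
    """
    scores = {name: float(predict(s)) for name, s in candidates.items()}
    return sorted(scores, key=scores.get)

# Toy stand-in 'model': energy grows with the squared norm of the coordinates.
predict = lambda R: np.sum(R ** 2)

candidates = {
    "variant_a": np.ones((3, 3)),         # energy 9.0
    "variant_b": np.zeros((3, 3)),        # energy 0.0
    "variant_c": 0.5 * np.ones((3, 3)),   # energy 2.25
}
print(rank_candidates(predict, candidates))
# -> ['variant_b', 'variant_c', 'variant_a']
```

With a real learned energy model in place of the toy lambda, the same loop answers questions the database cannot: enumerate the candidate structures and let the model order them.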
The point is that we need to put together insights from physics and chemistry and computer science. It is a hard task, but it is very rewarding. So I am coming to the

50:01

conclusion. Machine learning is central; it is a driving technology in big data. It is one of the two driving technologies: database management is one, machine learning is the other, and together they make data science. This is the philosophy of the Berlin Big Data Center. If we use it for neuroscience, we can do brain-computer interfacing, decoding of brain states, diagnoses, what not. In chemistry, we can be a million times faster at high accuracy. There is some hope for materials: I have played with molecules, and also with materials, together with the Max Planck Institute, with [name unclear]'s group, searching for superconductors. But that is only the beginning; so far we are much better with molecules than with materials. And the general question is: how can we get a better understanding? It is not only about prediction; we need to understand things. Of course, for industry it may not matter, we just predict, because a better prediction gives us more income. But if we want to understand science, then we should not wear blinders. I think there are a lot of open questions, and one of the open questions that we also tackle in Berlin is how to bring this very nice data-driven modeling technique together with the mathematical modeling world. I have given my talk from the data-driven perspective just to be a bit provocative here, because you do the other part. In principle one could say, well, the only thing we need is data, and then there is no need for models; but this is nonsense, because at some point we need to go back to models, because we need to have some understanding. This path is not well researched yet, and I think it needs to be researched. Thank you.

00:00
Kernel <Informatik>, Punkt, Inferenz <Künstliche Intelligenz>, Maschinelles Lernen, Aggregatzustand, Textur-Mapping, Raum-Zeit, Physikalische Theorie, Computeranimation, Rechenzentrum, Virtuelle Maschine, Datennetz, Vorlesung/Konferenz, Algorithmische Lerntheorie, Diskrete Wahrscheinlichkeitsverteilung, Lineares Funktional, Ruhmasse, Support-Vektor-Maschine, Arithmetisches Mittel, Mapping <Computergraphik>, Rechter Winkel, Grundsätze ordnungsmäßiger Datenverarbeitung, Mathematikerin, Kantenfärbung

03:32
Kernel <Informatik>, Subtraktion, Hausdorff-Dimension, t-Test, Vektorraum, Raum-Zeit, Physikalische Theorie, Computeranimation, Entscheidungstheorie, Vorzeichen <Mathematik>, Merkmalsraum, Bildgebendes Verfahren, Metropolitan area network, Nichtlineares System, Parametersystem, Lineares Funktional, Statistik, Pixel, Zehn, Support-Vektor-Maschine, Mustererkennung, Arithmetisches Mittel, Mapping <Computergraphik>, Polynom, Datenfeld, Funktion <Mathematik>, Ein-Ausgabe, Ordnung <Mathematik>

06:08
Fehlermeldung, Dualitätstheorie, Datennetz, Kerndarstellung, Hyperebene, Optimierungsproblem, Vektorraum, Extrempunkt, Support-Vektor-Maschine, Variable, Computeranimation, Virtuelle Maschine, Koeffizient, Merkmalsraum, Optimierung, ART-Netz, Neuronales Netz

06:59
Gewichtete Summe, Merkmalsraum, Punkt, Selbstrepräsentation, Raum-Zeit, Computeranimation, Kernel <Informatik>, Mooresches Gesetz, Maßstab, Vorzeichen <Mathematik>, Mustersprache, Nichtlineares System, Lineares Funktional, Parametersystem, Zentrische Streckung, Approximation, Datennetz, Gruppe <Mathematik>, Güte der Anpassung, Ein-Ausgabe, Rauschen, Mustererkennung, Beweistheorie, Perzeptron, Information, Schätzwert, Gewicht <Mathematik>, Zahlenbereich, Sprachsynthese, Term, Virtuelle Maschine, Graphikprozessor, Variable, Informationsmodellierung, Zusammenhängender Graph, Algorithmische Lerntheorie, Optimierung, Datenstruktur, Grundraum, Informatik, Schreib-Lese-Kopf, Tropfen, Mapping <Computergraphik>, Hilbert-Raum, Quadratzahl, Loop, Mereologie, Trigonometrie, Neuronales Netz

12:48
Schnittstelle, Telekommunikation, Bit, Subtraktion, Kontrollstruktur, Gruppenkeim, Zahlenbereich, Softwareschnittstelle, Computeranimation, Karhunen-Loève-Transformation, Virtuelle Maschine, Stetige Abbildung, Informationsmodellierung, Umwandlungsenthalpie, Lesezeichen <Internet>, Multivariate Analyse, Mathematische Modellierung, Translation <Mathematik>, Algorithmische Lerntheorie, Analysis, Gruppe <Mathematik>, Vektorraum, Support-Vektor-Maschine, Quick-Sort, Linearisierung, Arithmetisches Mittel, Echtzeitsystem, Rückkopplung, Gamecontroller, Projektive Ebene, Aggregatzustand, Neuronales Netz

17:09
Demo <Programm>, Bit, Punkt, Gruppenkeim, Computer, Kartesische Koordinaten, Computeranimation, Videokonferenz, Internetworking, Eins, Programmierparadigma, Quellencodierung, Gerade, Metropolitan area network, Zentrische Streckung, Sichtenkonzept, Datumsgrenze, Kugelkappe, Rechter Winkel, Projektive Ebene, Ordnung <Mathematik>, Aggregatzustand, Cursor, Standardabweichung, Subtraktion, Wellenpaket, Wellenlehre, Mathematisierung, Zahlenbereich, Interaktives Fernsehen, Unrundheit, Softwareschnittstelle, Term, Code, Virtuelle Maschine, Kugel, Spieltheorie, Notebook-Computer, Maßerweiterung, Grundraum, Gammafunktion, Touchscreen, Beobachtungsstudie, Soundverarbeitung, Quick-Sort, Künstliches Leben, Modallogik, Energiedichte, Echtzeitsystem, Gehirn-Computer-Schnittstelle, MP3, Mereologie, Gamecontroller, Codierung

25:44
Resultante, Bit, Wellenpaket, Punkt, Gewichtete Summe, Physiker, Blackbox, Formale Sprache, Maschinelles Lernen, Gleichungssystem, Diskrete Fourier-Transformation, Computerunterstütztes Verfahren, Quantenmechanik, Computeranimation, Netzwerktopologie, Virtuelle Maschine, Informationsmodellierung, Prognoseverfahren, Dichtefunktional, Theoretische Physik, Optimierung, Wellenmechanik, Approximation, Kategorie <Mathematik>, Raum-Zeit, Frequenz, Mereologie, Zeichenkette, Neuronales Netz

30:04
Matrizenrechnung, Kernel <Informatik>, Subtraktion, Selbstrepräsentation, Ungerichteter Graph, Kombinatorische Gruppentheorie, Element <Gruppentheorie>, Computeranimation, Informationsmodellierung, Bildschirmmaske, Prognoseverfahren, Lineare Regression, Algorithmische Lerntheorie, Fehlermeldung, Systemaufruf, Prognostik, Ähnlichkeitsgeometrie, Primideal, Vektorraum, Ein-Ausgabe, Abstand, Forcing, Energiedichte

32:08
Autorisierung, Perspektive, Subtraktion, Bit, Approximation, Gewicht <Mathematik>, Kategorie <Mathematik>, Computeranimation, Virtuelle Maschine, Energiedichte, Informationsmodellierung, Prognoseverfahren, Dreiecksfreier Graph, Kantenfärbung, Neuronales Netz

33:37
Punkt, Wasserdampftafel, Klasse <Mathematik>, Raum-Zeit, Computeranimation, Kernel <Informatik>, Richtung, Eins, Netzwerktopologie, Spezialrechner, Virtuelle Maschine, Variable, Informationsmodellierung, Prognoseverfahren, Trennschärfe <Statistik>, Gruppe <Mathematik>, Lineare Regression, Total <Mathematik>, Endogene Variable, Algorithmische Lerntheorie, Bildgebendes Verfahren, Nichtlineares System, Äquivalenzklasse, Prognostik, Ein-Ausgabe, Entscheidungstheorie, Linearisierung, Datenfeld, Erhaltungssatz, ATM, Kategorie <Mathematik>, Ordnung <Mathematik>, Pixel, Neuronales Netz

36:23
Resultante, Computeranimation, Eins, Spezialrechner, Total <Mathematik>, Neuronales Netz, Schreib-Lese-Kopf, Äquivalenzklasse, Pixel, Kategorie <Mathematik>, SIDIS, Stellenring, Prognostik, Linearisierung, Mapping <Computergraphik>, Taylor-Reihe, Benutzerschnittstellenverwaltungssystem, Erhaltungssatz, Mereologie, ATM, Wärmeausdehnung, Ordnung <Mathematik>

38:16
Interpretierer, Gewicht <Mathematik>, Datennetz, SIDIS, Ausbreitungsfunktion, Prognostik, Term, Physikalische Theorie, Computeranimation, Erhaltungssatz, Knotenmenge, Mereologie, Helmholtz-Zerlegung, Wärmeausdehnung, Neuronales Netz, Message-Passing, Funktion <Mathematik>

39:28
Kernel <Informatik>, Fehlermeldung, Subtraktion, Shape <Informatik>, Spider <Programm>, Gewichtete Summe, Prognostik, Vektorraum, Computer, Objektklasse, TLS, Computeranimation, Mapping <Computergraphik>, Spezialrechner, Informationsmodellierung, Softwaretest, Datennetz, Trennschärfe <Statistik>, Abstrakte Zustandsmaschine, Stichprobenumfang, Computerunterstützte Übersetzung, Algorithmische Lerntheorie, Maschinelles Sehen, Pixel, Neuronales Netz

41:15
Datenhaltung, Physikalismus, Klasse <Mathematik>, Mathematisierung, Computeranimation, Spezialrechner, Virtuelle Maschine, Informationsmodellierung, Prozess <Informatik>, Theoretische Physik, Datennetz, Mereologie, Kontrollstruktur, F-Test, Modelltheorie, Fehlermeldung, Benchmark

43:04
Rückkopplung, Telekommunikation, Abstimmung <Frequenz>, Subtraktion, Gewicht <Mathematik>, Selbstrepräsentation, Formale Sprache, Interaktives Fernsehen, Gleichungssystem, Computeranimation, Kernel <Informatik>, Informationsmodellierung, Tensor, Datennetz, Neuronales Netz, Analysis, Topologische Einbettung, Schießverfahren, Matrizenring, Kategorie <Mathematik>, Stellenring, Profil <Aerodynamik>, Symboltabelle, Vektorraum, Quellcode, Kontextbezogenes System, Energiedichte, Polarisation, Injektivität

45:34
Nachbarschaft <Mathematik>, Informationsmodellierung, Wärmeausdehnung, Graph, Raum-Zeit, Gruppenoperation, Interaktives Fernsehen, Tangente <Mathematik>, Computeranimation

46:15
Stabilitätstheorie <Logik>, Punkt, Physiker, Blackbox, Physikalismus, Gruppenkeim, Quantenmechanik, Raum-Zeit, Computeranimation, Task, Perspektive, Zeitrichtung, Informatik, NP-hartes Problem, Schnelltaste, Datennetz, Kategorie <Mathematik>, Datenhaltung, Gasströmung, Quantisierung <Physik>, Energiedichte, Datenfeld, Computerarchitektur, Information, Simulation

49:59
Physikalischer Effekt, Bit, Punkt, Materialisation <Physik>, Datenhaltung, Stichprobe, Gruppenkeim, Prognostik, Computer, Aggregatzustand, Datenmissbrauch, Computeranimation, Informationsmodellierung, Prognoseverfahren, Datenmanagement, Rechter Winkel, Perspektive, Offene Menge, Theoretische Physik, Physikalische Theorie, Mathematische Modellierung, Mereologie, Decodierung, Algorithmische Lerntheorie, Schnittstelle

### Metadata

#### Formal metadata

Title | Machine learning and applications |

Series title | The Leibniz "Mathematical Modeling and Simulation" (MMS) Days 2017 |

Author | Müller, Klaus-Robert |

Contributors |
Weierstraß-Institut für Angewandte Analysis und Stochastik (WIAS); TU Berlin, Institute of Software Engineering and Theoretical Computer Science |

License |
CC Attribution 3.0 Germany: You may use, adapt, copy, distribute, and make the work or its content publicly available in unchanged or adapted form for any legal purpose, provided you credit the author/rights holder in the manner specified by them. |

DOI | 10.5446/21906 |

Publisher | Technische Informationsbibliothek (TIB) |

Publication year | 2017 |

Language | English |

Production year | 2017 |

Production place | Hannover |

#### Content metadata

Subject area | Computer science |

Abstract | For some years now, machine learning (ML) has been broadening the modeling toolbox of the sciences and industry. The talk first reminds the audience of the main ingredients for applying machine learning, then discusses several ML applications in the sciences, namely brain-computer interfaces and quantum chemistry. |