The Joy of Simulation: for Fun and Profit
Automated media analysis
The TIB AV-Portal applies these automated video analyses:
Scene detection – shot boundary detection segments the video based on image features. A visual table of contents generated from it gives a quick overview of the video's content and offers targeted access.
Text recognition – intelligent character recognition captures, indexes, and makes written language (for example, text on slides) searchable.
Speech recognition – speech-to-text writes down the spoken language in the video as a searchable transcript.
Image recognition – visual concept detection indexes the footage with subject-specific and interdisciplinary visual concepts (for example, landscape, facade detail, technical drawing, computer animation, or lecture).
Keywording – named entity recognition describes the individual video segments with semantically linked subject terms. Synonyms or narrower terms of entered search terms can thereby be included in the search automatically, which broadens the set of results.
Recognized entities
Speech transcript
00:00
Check, check. Hi everyone.
00:10
Thanks for having me over here.
00:12
Today's talk is about the joy of simulation. My name is Vincent and I come from Amsterdam. Before we begin, you may want to get together, two to a laptop or so, and open one website while I'm talking.
00:26
I'm going to discuss what randomness is and what it isn't. I'm going to explain how sampling can actually be used to do a bit of inference, which is nice. I will then demonstrate a couple of experiments with sampling, and explain how I derived some better tactics for Monopoly using simulation. I will explain how I found out that you can sell Lego minifigures, and what lies at the core of that little trade. I will go on to talk about how sampling can be used as an optimization tactic, and I'll conclude by talking about how we can outsource creativity by using sampling, ending with Pokémon and related subjects,
01:02
which somehow blends everything together. So, before talking about what randomness has to do with
01:08
sampling, we should be sure that we understand what randomness is and isn't, because, you know, we're humans, and
01:15
computers nowadays tend to have a better understanding of randomness than we do. So please go to the website that I
01:21
just told you about; we're going to go ahead and do a bit of an inverse Turing test.
01:28
This is the website: it's my blog, and this is one of my blog posts, called "Human Entropy". Please go there right now; you should see a website that's somewhat similar to this.
01:39
You could read it, but the idea is that we're going to try to generate random numbers. Put one index finger on 1 and the other index finger on 0, and use these two buttons; you will notice that if you click, the number will increase. I'll just go ahead and do this: I'm going to generate a bunch of random numbers, trying to generate them as randomly as I can. Let's generate about 100. I'll go on and generate a few more; the experiment itself takes a couple more seconds.
02:21
I've got almost 200 numbers now; you can see the JavaScript slowing down a bit at the top there. So I've generated a couple hundred numbers, and the idea is I said:
02:32
OK, this is 0 0 1 1 1 0 0 0 1 1 1 0. If I scroll down, what I will then see are all these histograms: how often I typed a 1 and how often a 0, but also how often I picked a 0 after a 0, a 0 after a 1, et cetera. And what you'll notice is that even though I'm trying to be random, trying to type as many ones as zeros, I usually fall into this pattern where I do a 1 after a 0 and a 0 after a 1. That is very normal for a human being: it feels random even though it totally isn't. And this small model, you can read up on the math if you want, tries to do a real-time prediction of what you're going to type next, and you can also track its accuracy just below on the page. That would be more of a real-time
03:21
thing. So if I type 0, 0, 1, and so on, one after the other, then at the bottom you can see the probability of me being a human; the probability of me being a robot is quite low. I like this idea of human inference getting in the way. This is an inverse Turing test of sorts: if you cannot actually generate random numbers, that suggests you're definitely not a computer. So this is a
03:49
useful way, I hope, to quickly explain to you how randomness works and how it doesn't work,
03:54
and that it is useful to have some form of randomness available to you, but that you as a human are simply ill-equipped to generate it. Therefore we're going to use a computer instead for the whole thing.
04:06
Playing with this some more, typing 0, 0, 1, really one after the other, you see
04:12
that there is this estimator that tries to predict what I'm going to generate next, and you will see that at some point the probability of me generating a 1 after a 0 gets quite high. If I now switch my algorithm and start generating only zeros, you will notice there's a bit of confusion, but after a while it has picked up my new pattern and the accuracy goes up again. The cool thing about this is that with the laws of probability at my disposal, a little bit of math gives me all these predictions; but it is still useful to have random samples.
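The talk doesn't spell out how the predictor on the page works; a minimal sketch of the idea, counting which bit tends to follow the current context, might look like this (the function name and the alternating "human" stream are my own illustration, not the speaker's code):

```python
def predict_next(history, order=1):
    """Predict the next bit by counting which bit most often
    followed the last `order` bits earlier in the stream."""
    if len(history) <= order:
        return 0
    context = tuple(history[-order:])
    counts = {0: 0, 1: 0}
    for i in range(len(history) - order):
        if tuple(history[i:i + order]) == context:
            counts[history[i + order]] += 1
    return 0 if counts[0] >= counts[1] else 1

# A human-like "random" stream that alternates far too eagerly.
human = [0, 1] * 50
hits = sum(predict_next(human[:i]) == human[i] for i in range(2, len(human)))
accuracy = hits / (len(human) - 2)  # the alternation is fully predictable
```

Against a genuinely random stream this predictor hovers around 50% accuracy; against the 0-1-0-1 habit described above it becomes near perfect, which is exactly the effect the demo exploits.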
04:52
So that was a small demonstration; hopefully it's obvious that human entropy generation is quite terrible, which is why we prefer to use computers to help us think about probability. Because we have a computer available to us, and because we can sample quite fast, we can avoid doing a lot of the math ourselves. And, you know, math is
05:09
hard, and even though it is very useful, we'd often like to just get the job done. So the goal of this talk is to convince you that you can do a lot of these tasks just by sampling. And I guess the
05:23
easiest way to explain it: from a formal perspective, samples are useful when we know the characteristics of a system and we want to know the likelihood of a certain event happening; it might be easier to use sampling instead of math to do the inference for us. The simplest example I could come up with: suppose you have a lot of dice. You roll the dice, and there is a probability that a certain number of pips comes up. I could calculate that; I mean, you know, this probability
05:49
is computable. But what's easier to just go and do is: I can draw histograms. This is the histogram for one die, this is the histogram for two dice, this is the one for four dice, et cetera, and that gives a nice probability distribution; I didn't have to do any math, I just sampled. The nice thing about this is that I can ask this thing two questions. With four dice, what is the probability of getting a certain number of pips? But I can also ask a different question: given this number of pips, how likely is it that I have been rolling four dice? I know the rules of the system, I can describe them, and from
06:27
there, I just sample. That means that I don't have to do math, but I can still do inference. So instead of only looking at the problem from this direction, I can also look at it from the other direction. This is a powerful thing; it is something Bayesians like a lot.
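The dice histograms can be built exactly this way; a minimal sketch in Python (the function names and the 100,000-throw count are my own choices, not the speaker's):

```python
import random
from collections import Counter

def roll_sum(n_dice, rng):
    """Total pips from throwing `n_dice` six-sided dice once."""
    return sum(rng.randint(1, 6) for _ in range(n_dice))

rng = random.Random(42)
# Simulate 100,000 throws of four dice and histogram the totals,
# instead of working out the convolution of four dice by hand.
samples = [roll_sum(4, rng) for _ in range(100_000)]
hist = Counter(samples)

# Forward question: how likely is a total of 14 with four dice?
p_14 = hist[14] / len(samples)  # exact answer is 146/1296, about 0.113
```

The same `hist` answers both directions: normalized over totals it gives the forward probability, and comparing histograms built with different `n_dice` lets you ask which dice count best explains an observed total.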
06:42
Speaking of this kind of sampling, by the way: please consider looking at the library called PyMC3, and there's another library called emcee; their sampling methods for inference are very powerful. For those interested in the theory behind this topic, there's a very nice tutorial on my blog which explains how you can do some sort of time series analysis with these sampling techniques, as well as
07:01
doing inference on dice in particular. But anyway,
07:06
let's consider a fun example of how I actually got better at doing something because I had this computer available to me. Do we all know this game?
07:16
Yes? Do we also always play this game during Christmas? My dad always makes me play this game during Christmas, and I absolutely despise it; I don't see any joy in it whatsoever. So I figured it might be fun: I don't enjoy playing the game, but I could at least enjoy beating my dad. The idea was: can I use sampling a little bit to get better
07:37
at this game? Because if I think about it, every tile on this board is worth something, and I could calculate the expected value if only I knew the probability of actually landing on such a tile. Math-wise this would be a little bit hard, so the simplest approach is to just simulate: OK, let's just play 10,000 times. I know the rules of the system; I took the rules of the game from the web. I set a token at Start and just start rolling dice, applying the rules of the game. The shape of the board, with the long jump to jail, gives this game a very interesting characteristic: the likelihood of being around this corner of the board is quite high, because of the Go To Jail tile. So that is
08:18
what I did, in a simple way, not including the cards, and then I looked at what the
08:21
histogram looks like. On the x-axis you'll see the number of the tile: this is tile
08:28
0, this is tile number 39, and you can see the likelihood of you
08:33
landing somewhere. And you'll notice this tall spike, which coincides with the jail. One of the
08:40
most interesting things you'll notice is that after jail there seems to be a slightly higher likelihood of landing 2 steps away from jail, 4 steps away, 6, 8, 10, and 12 steps away. The reason is that when you're in jail and want to get out early, you have to throw doubles, the same number of pips on both dice. That is why you're more likely to land on one of these tiles, and if you know that beforehand, you can change your tactics, for example.
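A stripped-down version of that Monopoly walk can be sketched as follows. This assumes a 40-tile board with jail at tile 10 and Go To Jail at tile 30, and ignores the cards and the doubles rule, so it only reproduces the main jail spike, not the after-jail pattern:

```python
import random
from collections import Counter

N_TILES, JAIL, GO_TO_JAIL = 40, 10, 30

def simulate(n_turns, seed=0):
    """Walk a token around the board and count where it lands."""
    rng = random.Random(seed)
    pos, visits = 0, Counter()
    for _ in range(n_turns):
        pos = (pos + rng.randint(1, 6) + rng.randint(1, 6)) % N_TILES
        if pos == GO_TO_JAIL:  # landing here sends you straight to jail
            pos = JAIL
        visits[pos] += 1
    return visits

visits = simulate(100_000)
# The jail tile ends up visited far more often than an average tile,
# which is the spike visible in the histogram from the talk.
```

Dividing `visits` by the number of turns gives the landing probabilities per tile that the rest of the analysis builds on.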
09:08
I know there's a lot of randomness in the game, but if I were to choose a station on this board, it seems more relevant that I would take this station rather than that station, if I were given the choice. And
09:17
now I can calculate how likely it is that I land directly on each: this is the first station, this is the second one, the third one, the fourth one. You can see that one isn't, say, twice as likely as another, but there is a bit of an edge you can potentially use. And again, this
09:38
is something you can take further: every space, every station, generates an amount of revenue. You can scrape a Monopoly wiki and actually get the amount of money you can collect if someone lands on one of these places. That gives a nice table, which is sort of cool, but obviously you just go and plot this. So for every tile you'll see
09:57
a point listed here: this axis is the probability that you will land on that tile, this one is the rent you can charge if someone lands on it, and the size of the point is the expected value. If the point is very big, that means that on average, over the game, that tile will generate more revenue. One thing you'll notice is that there seems to be a sort of efficient iso-curve: everything on this imaginary line seems to be worthwhile. Down at the bottom there are a couple of, you know, not really performing properties. Can anyone guess which properties
10:32
these are, which ones are the problem? That would be the two most expensive ones: those would be these two. The probability of landing on them is quite small, but if you actually land there, boy are you in trouble. So it's a risk-versus-reward kind of thing: very rewarding, but with a lower probability of landing on them. You'll see this pattern also once you buy houses, but more about that in a second. So I'm not
11:08
saying that I actually got much better at playing this game, but I do understand the game a whole lot better. I didn't really measure how often I won, but I like the fact that just by using a load of sampling I came to understand the game a lot better: I know the mechanics of the system, I can collect data by sampling, and suddenly I understand the game much better. This is also on the blog, if you're interested; the blog post has been around for a while, and the kind people of the Internet pointed out all the weaknesses of it. There was one guy who sort of said: the flaw in your modelling is, I think, that it obviously doesn't encompass everything in the game. One person in particular was very adamant and said: look, if you want to win, you should buy these places; these are the ones you
11:55
want to have. And the reason is that there's this mechanic in the game that if you buy all the houses first, no one else can buy houses: if you're the person who actually owns all the houses, no one can invest in houses on any of their streets, and then suddenly yours become more valuable. We could go a little further, go into depth on how that would work, and a colleague of
12:16
mine actually built a bit of software around
12:19
this, where you can send your own genetic bots to play this game with a
12:22
certain strategy; do talk to me about this topic. This was definitely something I thought was fun: I understood the game better, it turned out to be a nice blog post, and it was a nice example of sampling for the fun of it. But there's also
12:36
another side: that was the crazy example, but there are also some places where, in practice, I want to make a living, and I might actually make more money if
12:45
I just do the inference by sampling, and the best example that I have involves Lego minifigures.
12:50
Who here is familiar with Lego minifigures? A show of hands. Great. Lego is a smart company: at some point they realized that if you combine Star Wars and Lego, you've got two collectors' items in one, and there will be more people willing to buy. With collectors' items you can make a lot of money. And then there's the original Lego
13:11
minifigure series. The thing about these is that they're kind of like a surprise bag: you open up the packet, but you don't know beforehand which little minifigure is in there. There are 16 in a set, and after two months or so they're never going to produce that set again. I thought: OK, that sounds interesting, would there be a market for this? So we go to this second-hand website,
13:29
scrape a little data, and make a small histogram, because we want to see whether investing in these minifigures at the moment would be profitable. This is what you get out, which isn't that pretty, so I don't really have an impression of what the
13:45
average price might be: there seem to be a lot around here, but there's a thick tail over there as well, so I don't get a good impression of what the average might be. So let's do bootstrapping: I have 60 of these prices, I grab a random sample of them and calculate the average, and I repeat that over and over
14:00
and over again. That gives smoother curves that I can visually interpret. It turns out, these are the results for the Simpsons minifigures that I was looking at, and these are for the other series. At the time, the Simpsons minifigure series was the most recent one, and I can imagine that as a series gets older it'll be worth more, so that might be a good caveat to keep in mind; and you shouldn't only ever look at averages. But this seems reliable enough.
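The bootstrap described above is only a few lines; a sketch with made-up prices standing in for the ~60 scraped ones:

```python
import random

def bootstrap_means(prices, n_rounds=10_000, seed=1):
    """Resample the observed prices with replacement and keep the mean
    of every resample, approximating a distribution for the average."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_rounds):
        resample = [rng.choice(prices) for _ in prices]
        means.append(sum(resample) / len(resample))
    return means

# Hypothetical second-hand prices in euros (the talk used real scraped ones).
prices = [3.5, 4.0, 4.2, 5.0, 5.5, 6.0, 7.5, 9.0, 12.0, 25.0]
means = sorted(bootstrap_means(prices))
low, high = means[250], means[9750]  # a rough 95% interval for the average
```

Plotting `means` as a histogram gives the smooth curve the talk mentions: even with a thick-tailed price distribution, the distribution of the resampled averages is much easier to read off.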
14:28
So, these are the figures: I can buy a little minifigure packet for 3 euros apiece, and I can sell a full set for 100 euros later. How likely is it to get the full set? Ignorant as I was, I figured I'd do this with math, and when I say I am going to use math, I mean math the way programmers do math: you go to
14:44
Math Overflow, ask the question, and get an answer back. It turns out to be even more complicated: there's this thing called Stirling numbers of the second kind, which I'd never heard of, that can in theory solve the problem. So instead of wrestling with the math, simulation comes to the rescue: I should probably just simulate it. So
15:03
here you see the number of packets that I would buy; the line itself shows the probability of getting a full set, and the line below it shows the expected number of total sets that I would have. Obviously, if I buy 100 packets, the odds of getting at least one full set are great, but it's rather likely I'd actually have two sets. Whenever you're doing these sorts of things, it's always nice to visualize once in a while, because if you visualize something you can get surprised. When I was looking at this, it made me wonder: gee, if I have one set and I keep buying packets, then from the first set I collected I probably have some spare Lego minifigures, which should make the second set easier to collect. So I took that thought and simulated it over and over again. The blue line you see here is again the number of packets against the number of sets I would have; it shows the average number of packets you might need for each set.
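The packet-buying simulation can be sketched as follows, assuming packets are uniformly random over the 16 figures (which, as the talk reveals later, real boxes are not):

```python
import random

N_FIGURES = 16  # figures in one series, as in the talk

def packets_until_full_set(rng):
    """Buy blind packets until every one of the 16 figures is owned."""
    owned, packets = set(), 0
    while len(owned) < N_FIGURES:
        owned.add(rng.randrange(N_FIGURES))
        packets += 1
    return packets

rng = random.Random(7)
draws = [packets_until_full_set(rng) for _ in range(20_000)]
expected_packets = sum(draws) / len(draws)  # theory: 16 * H(16), about 54
```

This is the classic coupon-collector problem: the Stirling-number formula gives the same expectation, but the simulation gets there without any combinatorics and extends trivially to questions like "how many packets for a second set given the spares from the first".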
16:09
To get the first full set, you theoretically need about 50 packets, I believe, but the number it takes here is much more than the number it takes
16:15
here, which in turn is more than the number it takes here, which in turn is more than this one here. So the moral: the more sets you already have, the cheaper the next one is to complete. This seems intuitive, but it's because I've done the inference that I was able to put numbers on it. So again, sampling is very useful. In Amsterdam I often
16:39
give a course in probability theory, so this seemed like a nice example. We gave this course to a bunch of bankers, and in the latter part of the afternoon we opened one of these boxes to see if the inference was correct. It turns out that if you open a box up, you will always
16:54
have 3 full sets: the figures are evenly distributed rather than random, presumably for quality reasons. So, that said, the
17:04
obvious use case we think of for simulation is probability theory, but there are actually fields other than just probability theory that can benefit nicely from
17:13
doing a good simulation exercise. So let's
17:15
talk about more general use cases: let's talk about optimization in general. I'll give you the idea of how it works with a slightly
17:21
silly example. This is an example where we know the correct answer beforehand. Suppose we have a 1-by-1 square and we want to find the largest triangle in this 1-by-1 square. Again, this is a very silly example, since we already know what the largest triangle is. But let's say this is a system that we want to optimize. The computer has no notion of what the best parameters are; its task is just finding 3 points such that the area between these 3 points is as big as possible. That's the only thing that's known to the computer; no math is given. So what's
17:55
the first thing a person could do? Well, how about we just, you know, generate a whole bunch of random triangles and pick the best one, and see how good that is. Just
18:05
random search. In code, that sort of means you just generate a whole bunch of random values in 6 dimensions, x1, x2, x3, y1, y2, y3, those being the 3 points, and then a function calculates the area between these points. You do that for a whole bunch of triangles and sort them. You know the largest triangle has area 0.5, and the probability of actually sampling a
18:32
triangle from that region is rather low, and this shouldn't surprise us. So what can we possibly do?
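Before improving on it, the plain random search just described can be sketched like this (a hypothetical minimal version; variable names are my own, and the true optimum is 0.5):

```python
import random

def triangle_area(x1, y1, x2, y2, x3, y3):
    """Shoelace formula for a triangle's area."""
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2

rng = random.Random(0)
# Random search: draw triangles uniformly in the unit square, keep the best.
candidates = [[rng.random() for _ in range(6)] for _ in range(10_000)]
best = max(candidates, key=lambda t: triangle_area(*t))
best_area = triangle_area(*best)  # the true optimum is 0.5
```

Even with 10,000 draws, `best_area` typically falls short of 0.5, which illustrates why pure random search stalls: near-optimal triangles occupy only a tiny sliver of the 6-dimensional space.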
18:39
Well, what I could do is say: gee, I have an area for each triangle, and I have all of these x-coordinates. How about I throw away all the bad triangles? I can take the average of all the triangle areas, throw away whatever happens to be at the bottom, and keep the ones that are above average. And instead of looking at the area, how about I look at the distribution of the points? There
19:05
is a distribution there, and as luck would have it, if your sample size is rather big, what you can do is learn a nice density estimator. That's what is visualized here: these are the density plots for
19:19
the points performing well. Just sort of eyeballing them, assuming these
19:24
are x1, x2, and x3: it seems that to get a big triangle, x1 has to be a low number or a high number, and if x1 is a low number then x2 has to be a low number as well, or a high number.
19:38
This coincides with my belief of what a big triangle should look like, which is sort of nice. By doing this, not only can I maybe get a sampling technique to give me better triangles, but I can understand the problem a little better as well, just like with Monopoly and just like with the Lego example. This is what the x and y
19:57
distributions look like; again, the figure
19:59
plots all the x-axes together and all the y-axes
20:02
together, and what I see sort of makes sense: I like it if my x's are either small or large, and that makes complete sense, right? So I
20:14
now have a distribution I can sample from. So the idea is:
20:18
I've selected the areas of the big winners, and I sample from that distribution instead. This is what
20:24
we had before, and if I now draw a new sample, not uniformly but from the distribution that I just learned,
20:29
you will notice that, sure enough, I'm now going to sample larger triangles. So how about I just repeat this: sample, select, learn; sample, select, learn; and repeat.
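The sample-select-learn loop is essentially the cross-entropy method; a sketch that fits an independent normal per coordinate to the elites (the population size, elite fraction, and iteration count are my own choices, and the talk's density estimator may well differ):

```python
import random
import statistics

def area(t):
    x1, y1, x2, y2, x3, y3 = t
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2

def clip(v):
    return min(1.0, max(0.0, v))  # keep points inside the unit square

rng = random.Random(0)
pop = [[rng.random() for _ in range(6)] for _ in range(1000)]
best = 0.0
for _ in range(20):
    pop.sort(key=area, reverse=True)
    best = max(best, area(pop[0]))
    elites = pop[:100]  # keep the best 10%
    # Fit an independent normal to each coordinate of the elites...
    mus = [statistics.mean(e[i] for e in elites) for i in range(6)]
    sds = [statistics.pstdev([e[i] for e in elites]) + 1e-3 for i in range(6)]
    # ...and draw the next generation from that learned distribution.
    pop = [[clip(rng.gauss(mus[i], sds[i])) for i in range(6)]
           for _ in range(1000)]
```

After a handful of rounds the learned distribution concentrates around one of the half-square triangles, and `best` climbs well past what plain random search reaches with the same sampling budget.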
20:42
The nice thing is that if we repeat the same idea, some sort of convergence pops out. Note that mathematically
20:47
I've said we were going to select the areas larger than some sample mean m, but you
20:52
could also take the median, or pick whatever other metric or threshold you like. The idea is to use inference on simulated data to learn more about the nature of the optimization problem. If you are familiar with genetic algorithms, you may notice that they actually work in quite a similar fashion, except that here I'm trying to look at the distribution of the parameters, while a genetic algorithm uses a similar tactic to do a proper search across the entire space. If you're interested in learning more about this, there is this colleague of mine, he's sitting over there, who will talk about this sort of thing in more detail later this week. Specifically, he'll talk about how you can win the board game Risk, or how to conquer the world. It's Thursday at 12 o'clock, in this room, and he'll talk more about genetic algorithms in general; they apply to many, many things, as you've probably noticed
21:40
as well. This was triangles, but I can sample anything, whether it's continuous or discrete; it's just a sampling exercise. That means I have a very flexible way of optimizing any system. That's not to say that I'll always find
21:52
the best solution, but it is a proper way to do a form of search.
21:59
Hopefully by now I've convinced you that sampling is indeed useful, and that it can be a little bit surprising in its use cases. What I'm going to talk about now is sort of my hobby project for the next
22:08
year, I think, and it's the thing I'm most interested in at the moment: generative methods. The thing I like about them is that they should allow you to outsource creativity, and entropy has a small role to play in this. For this
22:22
reason, this next bit introduces the more Markovian way to think about randomness. So far I've said: here's a distribution, give me a sample, then another sample, and it wasn't really the case that one sample depended on the one before it. I'll turn that around a little bit in the next step.
22:39
This is literally what is listed on my LinkedIn profile, and if you please spend a little time actually reading it, some of you may understand the joke. It seems surprisingly relevant. A
23:06
couple of people understand the joke. On your profile you're supposed to write down the stuff you do, so I've got Python and so on in there, the keywords all in place. That's nice, but it gets these nasty recruiters onto your profile, which I tend not to like. So I
23:23
figured it would be fun to just add a couple of Pokémon in there as well, right next to the actual tech skills. The main reason is that I like the joke, but also, there's nothing more
23:39
fun than when a recruiter says: would you come work for my corporate bank, you have the skills, we spotted them on your profile; and you can then say: well, you guys saw the Pokémon, right, and you still called? Every signal coming from the recruiter in question would be entirely wrong. Evidently recruiters are terrible classifiers, for they
23:57
cannot really distinguish a Pokémon name from a piece of technology. So I figured that making a Python library that can generate Pokémon names might actually be fun. The library is called gravel; it's totally not done yet, but the idea is to have techniques for this. And before you start thinking, gee, that's ridiculous, Vincent:
24:13
speaking of Pokémon versus technology names, here's a GitHub repo.
24:17
It turns out a lot of people have used Pokémon names for their actual GitHub projects; there's a link in the presentation if you want to look it up. The latter one is actually a Pokémon name, but it's also a template-based robot dynamics library. And there's this little web page that, for every one of the 750 Pokémon, tells you whether there's a GitHub project or npm package with that name.
24:41
In practice, the reason I want to build gravel is that I've always been a user of libraries rather than someone writing them myself, and this problem seemed interesting enough that I'd learn a lot from doing it. The idea is to generate names that sound like Pokémon; to put it bluntly, the whole point of gravel
25:04
was to come up with names. And then I started
25:09
thinking: OK, so I have this Pokémon-name setting with an interesting problem, but what it really involves is generating a believable sequence of tokens. If you think about it, I could do this for Pokémon names, but there are many other things I could do as well, like recruitment cover letters, or IKEA furniture names, or notes on a piano. The simplest model that you could possibly think of says: OK, suppose every token is independent, whether it's a letter in a word, or a word in a sentence, or a note in an entire arrangement, or
25:44
a sound in a furniture name. But the better idea is: once I know the previous token, I might have a different probability distribution for the next one. That is what Markovian thinking is all about: depending on the state I'm in now, I will sample from a different distribution for my next token.
26:03
This is the simplest such model: if I've seen the letter 'a' beforehand, the chance of next seeing a vowel is probably somewhat smaller, and the chance of a consonant might be somewhat larger. But you can also condition on more than
26:15
just the last token. From the tokens coming before, you generate a Markov chain: you learn it, and then you
26:20
sample from it. The basic way of doing this is a first-order Markov chain, but I can look further back as well: if I'm generating the third token, I can look at the second token and at the first one, so it's a Markov chain of order 2, in a sense. This is just a way of thinking about it: if I know t1, I have a distribution for t2; that's the sampling rule, if you will. But who says it has to be ordered in one direction? You can also model it from two directions. If you think about it, it seems fine to start with the first letter of a Pokémon name, because there's a prior belief about which letter a name starts with, but there's probably a different distribution for the last letter; it doesn't seem entirely sensible for a Pokémon name to end in the letter 'i', for example. So maybe we should have a Markov chain that goes one way and a Markov chain that goes the other way; that seems very sensible, and it's sort of the same probabilistic model. So I've been playing around with this. Here's what it looks like if you sample Pokémon names from it: some of them sound really quite plausible, and some of them are just hilarious. The lovely joy of making these sorts of models isn't that they're accurate.
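A first-order, one-direction version of such a chain fits in a few lines; the tiny corpus here is my stand-in for the full Pokémon list, and the function names are my own:

```python
import random
from collections import defaultdict

START, END = "^", "$"

def fit_chain(names):
    """Count which letter follows which, including start/end markers."""
    transitions = defaultdict(list)
    for name in names:
        tokens = [START] + list(name.lower()) + [END]
        for a, b in zip(tokens, tokens[1:]):
            transitions[a].append(b)
    return transitions

def sample_name(transitions, rng, max_len=12):
    """Walk the chain from the start marker until the end marker."""
    out, state = [], START
    while len(out) < max_len:
        state = rng.choice(transitions[state])
        if state == END:
            break
        out.append(state)
    return "".join(out)

# Tiny hypothetical corpus; the talk trained on the full Pokémon list.
corpus = ["pikachu", "bulbasaur", "charmander", "squirtle", "eevee"]
chain = fit_chain(corpus)
rng = random.Random(3)
names = [sample_name(chain, rng) for _ in range(5)]
```

Storing each observed successor in a list and drawing with `rng.choice` samples transitions in proportion to how often they occurred, which is exactly the learned first-order distribution; the bidirectional variant in the talk would add a second chain fitted over the reversed names.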
28:05
It's that I somehow do believe that a name like this could have come from the actual corpus. One thing you'll notice, though, is that the variability in the corpus is somewhat limited: certain substrings will practically always appear together, so the corpus has a strong influence on what this sort of model produces. It also works in an exciting way for IKEA furniture; I don't know about you guys, but I would definitely like to have a couch with one of these names.
28:39
The reason this happened was that I was talking to my girlfriend about this problem, and she was obviously not super impressed: you really spend time on this? Instead of making Pokémon names, how about we make IKEA furniture names? Which is actually an excellent idea. The thing is, while some of these are nice, a lot of them are actually quite long, and there were a lot of samples that didn't really make sense; I really think 'a' is a terrible name for furniture, or for a Pokémon. So there is still work to be done, and I've been thinking about how to tackle that. In machine learning it's very normal to make ensembles of models: instead of making one model, you make ten of them and combine them in some way. It seems sensible that if this model somehow works and that one works too, we should be able to combine them; the laws of probability certainly allow for it. So that seems like a thing I could do to make this better. My library has this notion of a lexicon, which is sort of like a data frame: where another model might need a data frame, these models need some sort of bag of sequences of tokens. So what I could do is say: here's a lexicon of all the Pokémon names, and here's a lexicon of the English language; train a Markovian-sounding model on each. These are two different models, trained on different data, but they can actually be glued together to generate sequences. It's a very fun experiment, in my mind, to generate Pokémon names that sound somewhat French, or Pokémon names that sound somewhat German. What I could also do is transcribe the Pokémon names phonetically: instead of just focusing on which letter came after which letter, I might actually add some domain knowledge and say, this letter happens to be a consonant, and the odds of getting several consonants after each other are rather low. So that's a mapping from the lexicon into something else, another model that coincides with it. This seems like a viable tactic, because you can sort of inject domain knowledge. What I could
30:52
also do is add a judge: how about I generate 100 samples from this one model, and then have that other model judge those 100 samples; I sort them all and take the top 10. That also seems like an appropriate way of thinking about it. And if I think about it more, this is sort of hardcore, but I can also think about it in a non-Markovian way: I could generate a factor graph instead, which has some nice benefits. We could even take a Levenshtein-ish approach, where I start with a real Pokémon name and, one by one, randomly change a letter; that's also a way to generate something that sounds possible, and the nice thing about doing it that way is that I'm not necessarily limited to just probability theory; I can use distance functions too, which I won't cover here. And then you can sort of wonder: hey, a word in a sentence has some context, and so does a letter in a name. The most interesting related work is these deep models, which are fashionable nowadays. The way those work is: you have a start token, you sample given prior beliefs to generate the next token, and you keep sampling all the way down until you see a stop token. This is nice, but it lacks some features that I would like to have, features which this other way of thinking does support. Suppose I want to generate a Pokémon name with 6 letters, and I already know the 2nd one and the 6th one, and I want to
32:13
Generating the rest is what makes the problem hard: not many approaches give you support for this. But if you are a deep learning specialist and you know the solution to this, do come talk to me afterwards; I would love to hear what you think about it. So I'll score the following domains:
32:28
For this use case, probabilistic graphical models seem alright, the heuristic approaches too, and deep learning seems promising as well. It is very interesting to see how these fields are converging: nowadays you can use a neural net as a generative algorithm, which takes some sort of Gaussian noise and generates a distribution out of it; OpenAI has, I believe, research on exactly this.
32:49
So the main lesson for me was to design this first, instead of just writing
32:52
code in a notebook. It seems like a better idea, before you write any actual code, to wonder: what does the UI look like? If I am going to be the user, happily using a lot of different lexicons and a lot of models that I train, what is the best way for me as a user to play around with this? So the design is roughly: there is some notion of a data frame, which I call a lexicon; there are these different models, say a factor graph, a Markov chain, et cetera, and this is what defines the properties of the model; I can then fit that on a separate lexicon; and once those models are trained, I can generate from them. The hopeful idea is that I can somehow make an ensemble based on two different models, maybe even give them some weighting, and then judge the outcome with some sort of judging model. Simply writing this down for myself, before working on it, made it much clearer to me
33:43
what I should be building and how I should design it. Hopefully by the end
33:49
of next year I can come back and talk about a model that can make Pokémon names where you say "I know this token and that one", and it fills in the other tokens one by one; that would be very useful.
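The API being sketched, lexicons as data frames for token sequences, interchangeable model types, fit then generate, then ensemble, might look roughly like this. Every name here (`Lexicon`, `MarkovChain`, `ensemble_generate`) is a hypothetical placeholder for the design described in the talk, not an existing library:

```python
import random
from collections import defaultdict

class Lexicon:
    """A bag of token sequences: the 'data frame' for sequence models."""
    def __init__(self, sequences):
        self.sequences = [list(s) for s in sequences]

class MarkovChain:
    """One possible model type; a factor graph could expose the same interface."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def fit(self, lexicon):
        for seq in lexicon.sequences:
            tokens = ["^"] + seq + ["$"]
            for a, b in zip(tokens, tokens[1:]):
                self.counts[a][b] += 1
        return self  # returning self allows MarkovChain().fit(lexicon)

    def transition_probs(self, token):
        nxt = self.counts[token]
        total = sum(nxt.values())
        return {t: n / total for t, n in nxt.items()} if total else {}

    def generate(self, rng=random):
        out, cur = [], "^"
        while True:
            probs = self.transition_probs(cur)
            cur = rng.choices(list(probs), weights=list(probs.values()))[0]
            if cur == "$":
                return "".join(out)
            out.append(cur)

def ensemble_generate(models, weights, rng=random):
    """Mix the transition probabilities of several fitted models at each step."""
    out, cur = [], "^"
    while True:
        mixed = defaultdict(float)
        for m, w in zip(models, weights):
            for t, p in m.transition_probs(cur).items():
                mixed[t] += w * p
        cur = rng.choices(list(mixed), weights=list(mixed.values()))[0]
        if cur == "$":
            return "".join(out)
        out.append(cur)
```

The point of writing the interface down first is visible here: the lexicon, the model, the fit step, and the ensemble are separate pieces that can be swapped independently.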
34:07
There is also this sort of
34:09
dream of building something that works on tokens for art as well.
34:13
This is actually something that doesn't use any entropy, but it is something you can generate fairly easily in just a couple of lines of JavaScript. If you are interested in that, come on Friday to my lightning talk; it will make pretty images, sort of like this but in 3D. The goal of that talk is just simple fun, if you are interested in how to build pretty stuff that looks like this. So, concluding:
34:38
Sampling can essentially be a lot of fun and sometimes actually profitable, the Lego example being one. Getting started is super easy, and it might be surprising how often it can help you out. Again, people don't necessarily always understand the maths, because it can be a little less straightforward, but sampling is very flexible: if you can describe the system and sample from it, you can actually solve a lot of problems. Python is a great language for this use case: it is surprisingly easy to be very flexible in just a couple of lines of code, and it is also quite fast, which is one of the great things. And one final thing: think about the API. It is something I would love to see more of from our community: try to optimize for the joy of whoever is going to be using the library. It is
35:16
better to spend an entire day thinking about the API you would most want to start using than to leave your users a mountain of code to climb in order to
35:25
use it. From the perspective of the language user, that is the idea I would like you to take away.
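As an example of the couple-of-lines flexibility claimed in this conclusion: the birthday question from earlier in the talk (what is the chance that two people in a room of n share a birthday?) needs no maths at all, just a description of the system and a loop. This sketch ignores leap years:

```python
import random

def p_shared_birthday(n_people, n_sims=10_000, rng=random):
    """Estimate P(at least two of n_people share a birthday) by simulation."""
    hits = 0
    for _ in range(n_sims):
        birthdays = [rng.randrange(365) for _ in range(n_people)]
        hits += len(set(birthdays)) < n_people  # a collision occurred
    return hits / n_sims
```

With 23 people the estimate lands near the textbook answer of about 0.507, without ever writing down the combinatorial formula.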
35:41
Any questions? About anything, Pokémon-related or otherwise.
35:57
[inaudible question]
36:03
Also, if you have done something similar to this, do come talk to me; this is sort of a hobby project, and I'd be happy to hear from someone who has tinkered with similar things. The question, if I understood it correctly, is: how would you simulate something like music, several voices at once, and how would you do that?
36:37
So that use case is definitely a harder problem. The main gist of what I'm trying to get at: music would be more difficult because, let's say, you have a bass that's playing, drums that are playing, and then you have the melody; you have three sequences that actually have to correlate somehow, and that doesn't really fit into this model. What you could do then: there are a couple of libraries that have a bit of support for this, where you go into probabilistic graphical modelling with weights. But there is one trick we can use here: say that the three tokens I see at one time step count as one separate token; that goes into one list, and I try to generate that. You might need a lot of data before you actually get the pattern right, so it is a little expensive. The other way is not to have the complexity in the data but to put it into the model: say these are three time series that are somehow correlated, and learn that; you could do that with a sampling approach, and there are some examples of that around. Then you get sort of correlated time series. If I had a whiteboard I could explain it more easily; come to me afterwards if you want to talk about it.
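The "count the three tokens as one token" trick from this answer can be sketched directly: zip the parallel streams into composite tuple tokens, fit an ordinary chain on those, and the generated tuples stay correlated by construction. The stream contents and function names are illustrative:

```python
import random
from collections import defaultdict

def fit_tuple_chain(streams):
    """streams: parallel sequences of equal length (e.g. bass, drums).
    Zipping them turns each time step into one composite token."""
    tokens = list(zip(*streams))
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(tokens, tokens[1:]):
        counts[a][b] += 1
    return counts

def generate_tuples(counts, start, n_steps, rng=random):
    """Walk the composite-token chain, then unzip back into separate streams."""
    out, cur = [start], start
    for _ in range(n_steps - 1):
        nxt = counts[cur]
        if not nxt:          # dead end: this composite token had no successor
            break
        cur = rng.choices(list(nxt), weights=list(nxt.values()))[0]
        out.append(cur)
    return list(zip(*out))   # one sequence per original stream
```

The cost mentioned in the answer is visible here too: the state space is the product of the per-stream vocabularies, so the chain needs a lot of data before most composite tokens have been seen at all.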
38:01
Anyone else? Yes.
38:27
Correct me if I repeat it wrong, but I believe the question is: this is a nice talk, but obviously when you have more dimensions it becomes harder, right? The triangle example is reasonably easy because it only has a few dimensions, and dimensionality is always an issue. Suppose we had
38:50
this, but in twelve thousand dimensions: obviously it's a whole lot harder. What I would be more tempted to say here is: consider genetic algorithms as an approach. They actually have a couple of real use cases, and you can actually solve things with them, especially in the optimization field where you have a lot of hills and many, many
39:05
dimensions. The reason we use genetic algorithms is not necessarily because they are the best thing possible, but because they are the best thing we have; you go into random algorithms because there is no
39:18
real alternative: greedy methods fail on those problems. "Yeah, so I didn't
39:25
quite get why you couldn't do, like, an LSTM?" So here's the thing with an LSTM: usually you can say "generate lots and lots of sequences". The issue I sort of have with this one use case: suppose you say, I want a Pokémon name that starts with an H,
39:42
and then after three tokens comes some other specific letter, in that exact place, and I don't want unfinished
39:50
sequences. You could handle it with the judging approach, generate a bunch of candidates and keep only the ones that match, but the thing is: suppose I have an extreme constraint. So
40:09
when an LSTM has learned its internal states, the way I would usually generate a new sequence is to say "start, go", and then sample token by token until it is done. With a model that has Pokémon knowledge I could say: generate all of them
40:24
until at some point you have a Pokémon name that starts with an H, then an A, then freedom in the rest, and keep only those. But that feels like oversampling a bit. So this is more of an open problem; it's very fun to go to academic conferences and ask professors about this. There should be a sort of more generic neural way to approach this problem, and the generative work from OpenAI might actually turn out to be a worthwhile avenue, but I've never really heard anyone say "here's an obvious solution to this problem". And it's not like it's the most important problem in the world, given how rarely it occurs, so I'm not going to get angry about it.
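The "generate all of them and keep only the matches" baseline called oversampling here is easy to write down as rejection sampling; `sample_fn` stands in for any name generator, and the parameter names are illustrative:

```python
import random

def constrained_sample(sample_fn, constraints, length, max_tries=100_000, rng=random):
    """Rejection sampling: draw names until one has the required length and
    the fixed letters at the fixed positions. Wasteful, but an obvious baseline."""
    for _ in range(max_tries):
        name = sample_fn(rng)
        if len(name) == length and all(name[i] == c for i, c in constraints.items()):
            return name
    return None  # the constraints were too tight for this sampling budget
```

The waste is exactly the objection raised in the answer: the tighter the constraints, the more of the sampling budget is thrown away, which is why a model that conditions on the fixed positions directly would be preferable.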
41:12
Metadata
Formal metadata
Title  The Joy of Simulation: for Fun and Profit
Series title  EuroPython 2016
Part  40
Number of parts  169
Author
Warmerdam, Vincent

License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You may use, adapt, copy, distribute and make the work or its content publicly accessible for any legal, non-commercial purpose, in unchanged or changed form, provided that you credit the author/rights holder in the manner they specify and pass on the work or this content, including in changed form, only under the terms of this license.
DOI  10.5446/21241
Publisher  EuroPython
Publication year  2016
Language  English
Content metadata
Subject area  Computer Science
Abstract  Vincent Warmerdam - The Joy of Simulation: for Fun and Profit. This talk discusses some joyful exercises in simulation. I'll demonstrate its usefulness, but moreover I'll discuss the sheer joy you can experience. I'll go over the following points (the short list): I'll show how you can avoid math by simulating, calculating the probability that two people in the room share a birthday. I'll show how simulation can help you get better at many games, starting with simple card games and roulette, and most prominently I'll discuss how to determine the value of buying an asset in the game of Monopoly. I'll demonstrate how you can simulate Red Hot Chili Peppers lyrics, or any other band, or legalese. I'll demonstrate the results of a scraping exercise which helped me determine the value of investing in Lego minifigures. Depending on the level of the audience I might also discuss how biased simulation can help you solve optimisation problems, or even introduce Bayesian statistics via sampling. I'll gladly leave this decision to the EuroPython committee.