## The Joy of Simulation: for Fun and Profit

Video in TIB AV-Portal: The Joy of Simulation: for Fun and Profit

- Title: The Joy of Simulation: for Fun and Profit
- Title of Series: EuroPython 2016
- Part Number: 40
- Number of Parts: 169
- License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
- Identifiers: 10.5446/21241 (DOI)
- Release Date: 2016
- Language: English

- Subject Area: Computer Science

Abstract: Vincent Warmerdam - The Joy of Simulation: for Fun and Profit

This talk discusses some joyful exercises in simulation. I'll demonstrate its usefulness, but moreover I'll discuss the sheer joy. I'll discuss how to generate song lyrics, how to get better at casino games, how to avoid math, how to play Monopoly, or even how to invest in Lego minifigures. No maths required; just a random number generator.

-----

This talk discusses some joyful exercises in simulation. I'll demonstrate its usefulness, but moreover I'll discuss the sheer joy you can experience. I'll go over the following points (the short list):

- I'll show how you can avoid math by simulating; I'll calculate the probability that two people in the live room have the same birthday.
- I'll show how simulation can help you get better at many games. I'll start with simple card games and with the game of roulette. Most prominently I'll discuss how to determine the value of buying an asset in the game of Monopoly.
- I'll demonstrate how you can simulate Red Hot Chili Peppers lyrics. Or any other band. Or legalese.
- I'll demonstrate the results of a scraping exercise which helped me to determine the value of investing in Lego minifigures.

Depending on the level of the audience I might also discuss how biased simulation can help you solve optimisation problems, or even introduce Bayesian statistics via sampling. I'll gladly leave this decision to the EuroPython committee.

So, everyone, thanks for having me over. This talk is about the joy of simulation. My name is Vincent and I come from Amsterdam. At the beginning I'm going to need you to get out your laptops and go to one website. While you're doing that, I'm going to discuss what randomness is and what it isn't. Then I'm going to explain how sampling can actually be used to do a bit of inference, which is nice. I will then demonstrate a couple of experiments with sampling, and explain how I derived some better tactics for Monopoly using simulation. I will explain how I found out that you can sell Lego minifigures for a profit. Then I'll go on and talk about how sampling can be used as an optimization tactic, and I'll conclude by talking about how we can outsource creativity by using sampling. And then I'll talk about some related subjects which somehow blend everything together.

So, randomness. Before talking about sampling we should be sure that we understand what randomness is and what it isn't, because, you know, we're humans, and computers nowadays tend to have a better understanding of random numbers than we do. So please go to the website that I just told you about; we're going to go ahead and do a bit of an inverse Turing test.

This is the website. This is my blog, and there is a blog post called "Human Entropy". Please go there right now, and you should see a website that's somewhat similar to this. You could read it, but the idea is that we're going to try to generate random numbers. Put one index finger on 1 and the other index finger on 0, or use these two buttons. You will notice that if you click, the number will increase. Just go ahead and do this: generate a bunch of random numbers, and try to generate them as randomly as you can. Let's generate about 100. I'm going to go ahead and generate a few more, and in a couple more seconds I'll explain the experiment itself.

I've now got about 200 numbers, and you can see the JavaScript slowing down a little. OK: 0 0 1 1 1 0 0 0 1 1 1 0. If I scroll down, what I then see is all these histograms of how often I picked a 1 and how often I picked a 0, but also how often I picked a 0 after a 0, a 0 after a 1, et cetera. What you'll notice is that even though I'm trying to be random, trying to make as many ones as zeros, I usually fall into this pattern where I do a 1 after a 0 and a 0 after a 1. That is very normal for a human being: it feels random even though it totally isn't. There is also a small model on the page (you can read up on the math if you want) that tries to do a real-time prediction of what you're going to type next, and you can track its accuracy a bit further down the page.

At the bottom you can see the probability of me being a human; the probability of me being a robot is quite low. I like this idea: in a way this is an inverse Turing test. By checking whether you can actually generate random numbers, I can establish that you're definitely not a computer. So this is, hopefully, a useful way to quickly explain how randomness works and how it doesn't work. It is useful to have some form of randomness available to you, but you as a human are simply terrible at generating it. Therefore we're going to use a computer instead.

If I keep playing with this (0, 0, 1, one after the other), you see that this estimator tries to predict what I'm going to generate next. At some point the predicted probabilities of me generating a 1 or a 0 start to switch. If I now change my algorithm and only generate zeros, you will notice that there's a bit of confusion, but after a while it has picked up my new pattern and the accuracy goes up again. If I generate only ones now, you will see the same pattern. The cool thing is that, because I have the laws of probability at my disposal, I can use a little bit of math to make all these predictions. But it's still useful to have a random sample.

After this small demonstration, hopefully it's obvious that human entropy is generally quite terrible. That is why we prefer to use computers to help us think about probability. And because we have a computer available to us, we can sample quite fast, and we can kind of avoid doing a bit of math.
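
The real-time predictor from the demo can be sketched in a few lines. The talk does not say which model the blog post uses, so this is only an assumed stand-in: it counts which bit followed each pair of previous bits and guesses the more frequent one. The function name `predict_accuracy`, the context length and the "human-like" generator (which alternates too often) are all hypothetical.

```python
import random
from collections import defaultdict


def predict_accuracy(bits, order=2):
    """Guess each bit from the `order` bits before it; return the hit rate."""
    # context -> [times a 0 followed, times a 1 followed]
    counts = defaultdict(lambda: [0, 0])
    hits = 0
    for i in range(order, len(bits)):
        context = tuple(bits[i - order:i])
        zeros, ones = counts[context]
        guess = 1 if ones > zeros else 0  # ties fall back to guessing 0
        hits += guess == bits[i]
        counts[context][bits[i]] += 1     # learn only after guessing
    return hits / (len(bits) - order)


rng = random.Random(0)
computer_bits = [rng.randint(0, 1) for _ in range(2000)]

# Caricature of a human: flips to the other bit 80% of the time.
human_bits = [0]
for _ in range(1999):
    previous = human_bits[-1]
    human_bits.append(previous if rng.random() < 0.2 else 1 - previous)

print("computer  :", predict_accuracy(computer_bits))
print("human-like:", predict_accuracy(human_bits))
```

On the computer-generated bits the predictor should hover around 50% accuracy, while the over-alternating sequence is much easier to predict, which is the point the demo makes.
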

Math is hard, and even though it is very useful, we do like to just get the job done. So the goal of this talk is to convince you that you can do a lot of these tasks just by getting a sample.

I guess the easiest way to explain this is from a modelling perspective. Sometimes we know the characteristics of a system and we want to know the likelihood of a certain event happening, and then it might be easier to use sampling instead of math to do the inference for us. The simplest example I could come up with is this: suppose you have a number of dice. You roll the dice, and there is a probability that a certain number of eyes pops up. I could calculate that; the laws of probability are applicable. But what I can also do is just simulate and draw histograms: this is the histogram for one die, this is the histogram for two dice, and so on for more dice. So I get a nice probability distribution and I don't have to do any math; I can just sample.

The nice thing is that I can ask this thing two questions. Suppose I have four dice: what is the probability of getting a certain number of eyes? But I can also ask a different question: given this number of eyes, how likely is it that I have been rolling a certain number of dice? I know the rules of the system, I can describe them, and from there I can sample. That means I don't have to do math, but I can still do inference. I can look at it from this direction, but I can also look at it from the other direction. This is a powerful thing, and it is something Bayesians like a lot.
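
The two questions about the dice can be sketched as follows. This is a minimal illustration rather than the speaker's code: the number of simulated rolls, the cap of six dice, and the assumption that every number of dice is equally likely beforehand are all choices made here.

```python
import random
from collections import Counter

rng = random.Random(42)
N_ROLLS = 10_000   # simulated rolls per number of dice (assumed)
MAX_DICE = 6       # we consider 1..6 dice (assumed)


def roll_total(n_dice):
    """Total number of eyes when rolling n_dice fair dice."""
    return sum(rng.randint(1, 6) for _ in range(n_dice))


# Forward direction: simulate a histogram of totals for each number of dice.
totals = {
    n: Counter(roll_total(n) for _ in range(N_ROLLS))
    for n in range(1, MAX_DICE + 1)
}


def p_total_given_dice(total, n_dice):
    """Estimated probability of seeing `total` eyes with n_dice dice."""
    return totals[n_dice][total] / N_ROLLS


def p_dice_given_total(total):
    """Inverse direction: how likely is each number of dice, given the total?

    Assumes each number of dice is equally likely a priori, and that
    `total` was seen at least once in the simulation.
    """
    weights = {n: p_total_given_dice(total, n) for n in totals}
    norm = sum(weights.values())
    return {n: w / norm for n, w in weights.items()}


print(p_total_given_dice(7, 2))   # should be close to 1/6
print(p_dice_given_total(12))
```

The same simulated histograms answer both questions; the inverse one only needs a normalisation step.
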

Having introduced this kind of sampling: by the way, please consider looking at a library called PyMC3; there's another library called emcee. These sampling methods for inference are very powerful, but a bit too theoretical for this talk. If you're interested, there's a very nice tutorial on my blog which explains how you can do a sort of time series analysis with these sampling techniques, as well as inference, Bayesian inference in particular. But anyway,
let's consider a fun example of how I actually got better at doing something because I had a computer available. Do we know this game? Yes? Do we also always play this game during Christmas? My dad always makes me play this game during Christmas, and I absolutely despise it; I don't see any joy in it whatsoever. So I figured: if I can't enjoy playing the game, I could at least enjoy beating my dad. So the idea would be: can I use sampling a little bit to get better at this game?

If I think about it, every tile on this board is worth something, and I could calculate its expected value if only I knew the probability of actually landing on such a tile. Math-wise this would be a little bit hard. But what I can do is say: OK, I know the rules of the system (I got the rules of the game from the web), so let's just, 10,000 times, start here, roll dice, and use the rules of the game to see how likely it is to land on certain tiles. There is a very interesting characteristic of this game: the likelihood of being around this corner of the board is quite high, because of "go to jail".

I did this in a simple way, not including the cards, and I end up with a histogram that looks like this. On the x-axis you see the number of the tile: this is tile number 0, this is tile number 39, and you can see the likelihood of landing on each. You'll notice a spike, which coincides with the jail. One of the most interesting things you'll notice is that after jail there seems to be a slightly higher likelihood of being 2 steps away from jail, 4 steps away from jail, 6, 8, 10 and 12 steps away. The reason is that you are likely to end up in jail, and you only get out after rolling the same number of eyes on both dice, which is why it is more likely that you land on one of these tiles. If you know that beforehand, you can change your tactics.

For example, I know there's a lot of randomness in the game, but if I were to choose a station on this board, it seems sensible that I would take this station rather than that station if I were given the choice. Now I can calculate how likely it is that you land there: this is the first station, this is the second one, the third one and the fourth one. You can see that it's not twice as likely or anything, but there is a bit of a bias that I can potentially use.
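
A stripped-down version of this board simulation might look like the sketch below. Like the talk's version it leaves out the cards. The remaining simplifications are assumptions made here, not the speaker's rules: the three-doubles rule is ignored, a player leaves jail on doubles or on the third attempt, and "landing" means the tile a player occupies at the end of a turn. Tile numbers follow the standard board (jail at 10, "go to jail" at 30, stations at 5, 15, 25 and 35).

```python
import random
from collections import Counter

N_TILES, JAIL, GO_TO_JAIL = 40, 10, 30
STATIONS = (5, 15, 25, 35)


def simulate_landings(n_turns, seed=0):
    """Return the share of turns that end on each tile for a single player."""
    rng = random.Random(seed)
    landings = Counter()
    position, in_jail, jail_turns = 0, False, 0
    for _ in range(n_turns):
        d1, d2 = rng.randint(1, 6), rng.randint(1, 6)
        if in_jail:
            jail_turns += 1
            if d1 != d2 and jail_turns < 3:
                landings[JAIL] += 1          # no doubles: stay in jail this turn
                continue
            in_jail, jail_turns = False, 0   # doubles or third attempt: leave
        position = (position + d1 + d2) % N_TILES
        if position == GO_TO_JAIL:
            position, in_jail = JAIL, True   # sent straight to jail
        landings[position] += 1
    return {tile: landings[tile] / n_turns for tile in range(N_TILES)}


probs = simulate_landings(200_000)
for station in STATIONS:
    print(f"station on tile {station}: {probs[station]:.4f}")
print(f"jail (tile {JAIL}): {probs[JAIL]:.4f}")
```

Plotting `probs` against the tile number gives the kind of histogram described above, with the spike at the jail tile.
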

Again, each of these places, stations included, generates a certain amount of revenue. You can scrape the web for that and get the amount of money you receive if someone lands on one of these places. This is a nice table, which is sort of cool, but obviously you just go and plot this. For every tile you can buy, you'll see a point plotted here. This is the probability that you will land on said tile, and this is the rent you can charge if someone lands on the tile. The size of the point is the expected value, so if the point is very big, that means that on average, over the game, that tile will generate more revenue.

The thing you'll notice is that there seems to be a sort of efficient iso-curve: everything on this imaginary line seems to be worthwhile. Down at the bottom there are a couple of tiles that are not really performing. Can anyone guess which tiles those are? Those would be these two: the probability of landing there is quite small, but if you actually land there, boy, are you in trouble. So it's a risk versus reward kind of thing: there is a lower probability of landing there, because the go-to-jail tile comes just before it. You see this pattern also if you have bought a house there, or a second one.

I'm not saying that I actually got much better at playing this game, but I do understand the game a whole lot better. I didn't really measure how often I won, but I like the fact that just by doing a load of sampling I actually understand the game better. I know the mechanics of the system, I can collect data by sampling, and suddenly I understand the game a lot better. This is also on the blog, if you're interested.

This blog post was on Hacker News for a while, and the kind people on the Internet pointed out all its weaknesses. That is the flaw of modelling, always: this is only a start, and obviously it doesn't encompass everything in the game. One person in particular was very adamant and said: look, if you want to win, you should buy these places; these are the ones you want to have. The reason is that there's this mechanic in the game that if you buy all the houses first, no one else can buy houses. If you're the person who owns all the houses here, no one can invest in houses on any of these other lots, and then suddenly these do become more valuable. We could go a little further and go in depth on how that would work. A colleague of mine actually built a bit of software around this, and you can send in your own bots to play this game with a certain strategy. Talk to me about this topic if you're interested.

This was definitely something I thought was fun: I understood the game better, it turned out to be a nice blog post, and it was a nice example of when sampling is useful.

This was a bit of a crazy example, but there are also places where, in practice, I want to make a living, and where I might actually make more money if I just do the inference by sampling. The best example that I have of this is Lego minifigures.

Who here is familiar with Lego minifigures? A show of hands? Great. So Lego is a smart company. At some point they realized: if we combine Star Wars and Lego, we've got two collectors' items in one, and more people will be willing to buy. With collectors' items you can make a lot of money. And then there are the original Lego minifigures. The way I think of it, it's kind of like a Kinder toy: you open up the packet, but you don't know which little minifigure is in there. There are 16 in a set, and after two months or so they're never going to produce that set again. I thought: OK, that sounds interesting, would there be a market for this?

So we go to this second-hand website, scrape a little data, and make a small histogram. You want to see whether, supposing we invest in minifigures at the moment, it will be profitable. This is the histogram you get out, which isn't that pretty, so I don't really have an impression of what the average price might be. There seem to be a lot around here, but there's a thick tail over there as well. So to get a good impression of what the average might be, I do a little bootstrapping: say that I have 60 of these prices; I grab 30 of them at random and calculate the average, and repeat that over and over and over again. Then I get a smoother curve that I can visually interpret.

This is for the Simpsons minifigures that I was looking at, and these are for the other series. At the time, the Simpsons minifigure series was the most recent one, and I can imagine that as a series gets older it'll be worth more, so this might be a good one to look at. You should always be careful with averages, but this seems reliable enough.
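
The bootstrapping step can be illustrated with the sketch below. The scraped prices are not part of the transcript, so the 60 prices here are made-up, right-skewed numbers that only stand in for them. The helper `bootstrap_means` and its parameters are hypothetical; by default it resamples with replacement at the full sample size, which is the usual bootstrap, and `sample_size` can be set to a smaller number to mimic grabbing a subset.

```python
import random
import statistics

rng = random.Random(1)

# Made-up stand-in for 60 scraped second-hand prices in euros (not real data).
prices = [round(rng.lognormvariate(1.2, 0.6), 2) for _ in range(60)]


def bootstrap_means(data, n_resamples=5000, sample_size=None):
    """Resample with replacement and return the mean of every resample."""
    size = sample_size or len(data)
    return [statistics.fmean(rng.choices(data, k=size)) for _ in range(n_resamples)]


means = sorted(bootstrap_means(prices))
# Percentile interval: cut off 2.5% of the resampled means on each side.
low = means[int(0.025 * len(means))]
high = means[int(0.975 * len(means))]
print(f"sample mean {statistics.fmean(prices):.2f}, "
      f"95% bootstrap interval ({low:.2f}, {high:.2f})")
```

A histogram of `means` is the smoother curve: it shows how uncertain the average price is, rather than how spread out the individual prices are.
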

So these are the figures: I can buy a minifigure for 3 euros apiece, and I can sell a full set for 100 euros later. How likely is it to get a full set? At first I figured I'd do this with math, and when I say I'm going to use math, I mean the programmer's way of doing math: you go to MathOverflow, ask the question, and you get an answer back which is even more complicated. It turns out there's this thing called Stirling set numbers, which I have no intuition for, and which in theory solves the problem. So instead of looking at the math, or waiting for Stack Overflow to come to the rescue, I should probably just simulate it.

Here you see the number of packets that I would buy. One line shows the probability of getting a full set, and the line below it shows the expected total number of sets that I would have. Obviously, if I buy 100 packets, the odds of getting at least one full set are pretty great, but I might actually have two sets; that's rather likely.

Whenever you're doing these sorts of things it's nice to visualize once in a while, because if you visualize something you can get surprised. When I was looking at this, it made me wonder: gee, if I have one set and I keep collecting, then from the first set I collected I probably have some spare Lego minifigures, which I could use to make the second set easier to collect. So with that thought I simulated this over and over again. The blue line you see here is again the number of packets against the number of sets I would have. This shows the average number of packets that you might need to get one set: you need about 50. But the time it takes here is much more than the time it takes here, which in turn is more than the time it takes here, which in turn is more than this one here. So the more I buy, the higher the likelihood that I get a lot more sets. Again, this seems intuitive, but it's because I've done the inference that I was able to figure this out. So again, sampling is very useful.

In Amsterdam I often give a course in probability theory, so this seemed like a nice example. I was giving this course to a bunch of bankers, and in the latter part of the afternoon we opened one of these boxes to see if the inference was correct. It turns out that if you open the box up, you will always have three sets: the figures aren't even randomly distributed.
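
The packet-buying experiment can be simulated with the sketch below. It assumes what the box-opening anecdote calls into question, namely that every packet contains one of the 16 figures uniformly at random. The function name `packets_until_sets` and the number of repetitions are choices made here for illustration.

```python
import random
from collections import Counter

N_FIGURES = 16  # figures in one series, as mentioned in the talk


def packets_until_sets(n_sets, rng):
    """Buy packets until n_sets full sets are owned.

    Returns the number of packets bought at the moment each set was completed.
    """
    owned = Counter()
    packets, milestones = 0, []
    while len(milestones) < n_sets:
        owned[rng.randrange(N_FIGURES)] += 1   # open one random packet
        packets += 1
        # The k-th set is complete once every figure is owned at least k times.
        if len(owned) == N_FIGURES and min(owned.values()) > len(milestones):
            milestones.append(packets)
    return milestones


rng = random.Random(7)
runs = [packets_until_sets(3, rng) for _ in range(2000)]
avg = [sum(run[k] for run in runs) / len(runs) for k in range(3)]
for k, packets in enumerate(avg, start=1):
    print(f"average packets needed for {k} full set(s): {packets:.1f}")
```

The first set should take a bit over 50 packets on average, and each additional set takes fewer extra packets than the one before, because the spares left over from earlier sets count towards the next one.
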

So the obvious use case when we think of simulation is: use simulation because you want to avoid probability theory. But there are actually other fields than just probability theory that can benefit from doing a good simulation exercise. So let's talk about more general use cases. Let's talk about optimization in general. To give you an idea of how it works, I'm going to give a slightly silly example, one where we know the correct answer beforehand.

Suppose we have a 1 by 1 square and we want to find the largest triangle in this square. Again, this is a very silly example; we know what the largest triangle is: just draw a diagonal and you're done. But let's say this is a system that we want to optimize, and the computer has no notion of what the best parameters are. We're just interested in finding 3 points, and the area in between these 3 points has to be the biggest. That is the only thing that's known to the computer; there is no math involved.

So what's the first thing we can do? Well, how about we just generate a whole bunch of random triangles, pick the best one, and see how good that is. In code, that sort of means you just generate a whole bunch of random values in 6 dimensions, x1, x2, x3, y1, y2, y3; those are the 3 points. Then there's this function, the shoelace formula, which calculates the area between these points, and you do that for a whole bunch of triangles. We know the largest triangle has area 0.5, and the probability of actually sampling a triangle from this region is rather low, which isn't a surprise. So what can we possibly do?
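
The brute-force step is short enough to write out. This is a sketch under assumptions: the number of triangles and the 0.45 cut-off for "close to optimal" are arbitrary choices made here, and the area is computed with the shoelace formula for three points.

```python
import random


def triangle_area(x1, y1, x2, y2, x3, y3):
    """Shoelace formula for the area of a triangle given its three corners."""
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2


rng = random.Random(3)

# Each triangle is six uniform random numbers: three (x, y) points in the unit square.
triangles = [[rng.random() for _ in range(6)] for _ in range(100_000)]
areas = [triangle_area(*t) for t in triangles]

best = max(areas)
share_near_optimum = sum(a > 0.45 for a in areas) / len(areas)
print(f"best area found: {best:.3f} (the optimum is 0.5)")
print(f"share of samples with area above 0.45: {share_near_optimum:.5f}")
```

The best of many random triangles is decent, but only a tiny share of the samples comes anywhere near the optimum, which motivates smarter sampling.
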
What I could do is say: gee, I have an area and I have all of these x coordinates, so how about I throw away all the bad triangles? I can take the average of all the triangle sizes, drop the ones at the bottom and keep the ones that do well. And instead of looking at the area, how about I look at the distribution of the points there?
There is a distribution there and, as luck would have it, if the sample size is rather big, I can learn it: I can fit a nice density estimator. That's what is visualized here; this is the density plot for the points that perform well.
These are x1, x2 and x3, and it seems that for a big triangle x1 has to be a low number or a high number, and if x1 is a low number then x2 has to be a low number as well, or a high number.
This coincides with my belief of what a big triangle should look like, which is sort of nice. If I do this, not only can I maybe get a sampling technique that gives me better triangles, but I can understand the problem a bit better as well, just like in the Monopoly and the Lego examples.
This is what the x and y distributions look like. Again, in the plots these are the x's, the y's, and the x's and y's together, and what I see sort of makes sense; the relationship between my x and my y makes complete sense, right?
So I have a distribution and I can sample from it. The idea is: I've selected the triangles whose areas are big, and I sample from that distribution instead. This is what we had before, and if I draw a new sample from the distribution that I just learned, you will notice that I'm going to sample larger triangles.
So how about I just repeat this: sample, select, learn, and sample again? The nice thing is that if we repeat the same idea, some sort of convergence pops out. Note that, mathematically, all I've said is that we are going to select the areas that are larger than some sample mean.
This example is triangles, but I can sample anything, continuous or discrete; it is just a sampling exercise, which means I have a very flexible way of optimizing any system. I'm not saying that I always find the best solution, but it is a proper way to do a form of search.
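The sample, select, learn, resample loop can be sketched as follows. The talk does not say which density estimator was used, so this sketch swaps in a simple Gaussian kernel density estimate: sampling from it amounts to picking one of the surviving triangles at random and jittering its coordinates. The bandwidth, population size, number of rounds and all names are assumptions made for illustration.

```python
import random

def triangle_area(t):
    """Shoelace area for a triangle stored as (x1, y1, x2, y2, x3, y3)."""
    x1, y1, x2, y2, x3, y3 = t
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2

def clip(v):
    """Keep a coordinate inside the unit square."""
    return min(1.0, max(0.0, v))

def sample_from_kde(elite, n, bandwidth, rng):
    """Draw n triangles from a Gaussian KDE over `elite`:
    pick a stored triangle at random, then add noise to each coordinate."""
    samples = []
    for _ in range(n):
        base = rng.choice(elite)
        samples.append(tuple(clip(rng.gauss(v, bandwidth)) for v in base))
    return samples

def search(n=2000, rounds=10, bandwidth=0.05, seed=0):
    rng = random.Random(seed)
    # Round zero: uniform random triangles, as in the naive approach.
    population = [tuple(rng.random() for _ in range(6)) for _ in range(n)]
    mean_history = []
    for _ in range(rounds):
        areas = [triangle_area(t) for t in population]
        mean_area = sum(areas) / len(areas)
        mean_history.append(mean_area)
        # Select: keep only triangles larger than the sample mean.
        elite = [t for t, a in zip(population, areas) if a > mean_area] or population
        # Learn and resample: draw the next population from the KDE of the elite.
        population = sample_from_kde(elite, n, bandwidth, rng)
    return mean_history, max(triangle_area(t) for t in population)

mean_history, best_area = search()
print([round(m, 3) for m in mean_history], round(best_area, 3))
```

The mean area per round should climb towards 0.5, which is the convergence described above. Nothing in the loop is specific to triangles: only `triangle_area` and the way a candidate is represented would change for another system.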
Hopefully by now I've convinced you that sampling is indeed useful and that it can be a little bit surprising in its use cases. What I'm going to talk about now is sort of my hobby project for the next
year, I think, and it's the thing I'm most interested in at the moment: generative methods. The thing I like about them is that they should allow you to outsource creativity, and entropy has a small role to play in this.
For this reason the next bit introduces a more Markovian way to think about randomness. So far I've said: here is a distribution, give me a sample, and another sample, and it wasn't really the case that one sample had any influence on the next step.
And this is literally what is listed on my LinkedIn profile, and if you spend a little time actually reading it, some of you may understand the joke. It seems surprisingly relevant.
A couple of people understand the joke. On LinkedIn you want to write down the stuff you do, so I've got Python and JavaScript and so on, with the keywords in the right place. That is nice, but it also gets these nasty recruiters on your profile, which I tend not to like, so I
figured it'd be fun to just add a couple of Pokémon in there as well. The main reason I like to do this is that there's nothing more
fun than when a recruiter says "hey, would you come to work for my corporate bank?", you ask whether they use one of those Pokémon, and the recruiter says "yes, obviously". The signal from the recruiter in question would be entirely wrong. It turns out recruiters are terrible classifiers for Pokémon:
recruiters can't really distinguish a Pokémon name from the name of a technology. So I figured making a Python library that can generate Pokémon names might actually be fun. The library is called gravel; it's totally not done yet, but the idea is to have techniques for this. And before you start thinking "gee, that's ridiculous, Vincent",
and speaking of Pokémon, this is a GitHub repo.
It turns out a lot of people have used Pokémon names for their GitHub projects, and there's a link in the presentation if you want to look at them. Metapod is actually a Pokémon and a technology: it's a template-based robot dynamics library. And there's this little web page that, for every one of the 750 Pokémon, gives you a link to a matching GitHub project or package.
The reason I want to make gravel is that I've always been a user of libraries and never really written one myself, the problem seemed interesting enough, and I like what you learn from doing this. So the idea is to generate names that sound like Pokémon and, to put it bluntly, the whole point of gravel
is to come up with names.
OK, so I have this problem of Pokémon names, which is an interesting problem, but it involves generating a believable sequence of tokens. If you think about it, I could do this for Pokémon names, but there are many other things I could do it with as well, like Red Hot Chili Peppers lyrics or IKEA furniture names or notes on the piano. So take the simplest model that you could possibly think of. Suppose I have some token; this is independent of whether it's a letter in a word, or a word in a sentence, or a note in an entire arrangement, or
a sound in an IKEA furniture name. The idea is that once I know the previous token, I might have some other probability distribution for the next one. Markovian thinking is all about that: depending on the state I'm in now, I will sample from a different distribution for my next token.
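As a minimal sketch of that idea, here is a first-order Markov chain over letters in plain Python. It is not the gravel API (which the talk describes as unfinished); the function names, the `^` and `$` start and stop markers, and the ten-name corpus are all assumptions chosen for illustration.

```python
import random
from collections import Counter, defaultdict

START, STOP = "^", "$"  # hypothetical markers for the beginning and end of a name

def fit_markov_chain(words):
    """For every token, count which tokens follow it in the corpus."""
    counts = defaultdict(Counter)
    for word in words:
        tokens = [START] + list(word) + [STOP]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def generate(counts, rng, max_len=12):
    """Walk the chain: the distribution of the next token depends only on the current one."""
    token, letters = START, []
    while len(letters) < max_len:
        options = counts[token]
        token = rng.choices(list(options), weights=list(options.values()))[0]
        if token == STOP:
            break
        letters.append(token)
    return "".join(letters)

# A tiny illustrative corpus; a real lexicon would hold hundreds of names.
names = ["pikachu", "raichu", "bulbasaur", "charmander", "squirtle",
         "metapod", "pidgey", "rattata", "vulpix", "psyduck"]

model = fit_markov_chain(names)
rng = random.Random(0)
print([generate(model, rng) for _ in range(5)])
```

The same two functions work unchanged if the tokens are words in lyrics or notes in a melody; only the way the corpus is split into tokens differs.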
And this is the simplest model. If I see the letter A beforehand, the chance of seeing a vowel next is probably somewhat smaller and the chance of a consonant might be somewhat larger. But you don't have to do it just for the last token that came before.
The basic way of doing this is a Markov chain, but instead of looking one token back, maybe I look further back as well. So if I'm generating the 3rd token I should look at the 2nd token and the 1st one, so it's a Markov chain of length 2 in a sense. This is just a way of thinking about it: if I know the earlier tokens, there is a distribution for the next one; that is the sampling rule, if you will.
But who says it has to be ordered in one direction? I can also model this from two directions, if you think about it. It seems fine to start with the first letter of the Pokémon name, because there is a prior belief that a name could start with a certain letter, but there is probably a different distribution for the last letter. It doesn't seem entirely insensible that Pokémon names probably don't end with the letter I, for example. So maybe instead of a Markov chain that goes one way we should have a Markov chain that goes both ways; it seems very sensible and it is sort of the same probabilistic model.
So I've been playing around with this, and this is what it looks like if you sample Pokémon names. For the Red Hot Chili Peppers the results are hilarious: "can you believe hold me please by the way I wonder what the wave man YTD screening and the there's been when I was fortunate I know you must've in fact this year in the sun and the bottom dollar foxhole of applying your house not the spin further like".
But you can do this, and the lovely joy of making these sorts of models isn't that the output is accurate; it is that I somehow do believe that this came from the Chili Peppers and not from some other band. One thing I did notice is that the combinations in the corpus are somewhat limited: if I say "by", it is obviously always going to be together with "the way", and then it is just switching between the songs. So the corpus does have an influence. This also sort of works for IKEA furniture; I don't know about you guys, but I would definitely like to have a couch with a name like that. Another thing you could
also do is try judges. How about I generate 100 samples based on one model, and then have another model judge those hundred samples? I sort them all and take the top 10. That also seems like an appropriate way of thinking about it.
This is sort of the hardcore bit, but you can also think about it not in a Markovian way and generate a factor graph instead; this has some nice benefits. We can even take a Levenshtein-ish approach, where I start with a Pokémon name and, one by one, randomly change a letter; that is also a way to generate something that sounds plausible. The nice thing about doing it this way is that I'm not necessarily limited to just probability theory, but can use something like genetic algorithms here.
Then you can sort of wonder: a word in a sentence has some context, and so does a letter in a word; it's a sequence. An interesting way of handling that is these deep models, which are fashionable nowadays. The way that works is that you start with a start token and some prior beliefs to generate from, and then sample all the way down until you see a stop token. This is nice, but there are some features that I would like to have which this way of thinking doesn't necessarily support. Suppose I want to generate a Pokémon name with 6 letters and I know the 2nd one and the 6th one; how am I going to
generate the rest? This is what makes the problem hard; it is not obvious how to support this. But if you are a deep learning specialist and you think you know the solution to this, come talk to me, I'd like to hear what you think about it. So I'll focus on the following domains:
probabilistic graphical models seem all right, a heuristic approach seems all right, and deep learning seems all right. It's very interesting to see how these are converging: nowadays you can use a neural net as a generative algorithm which takes some sort of Gaussian noise and generates some sort of distribution out of it. This is from OpenAI, who have research on problems like this. And this is roughly the
plan. So the main lesson I had from designing this, instead of just writing
code in a notebook: it seems like a better idea, before you write any actual code, to wonder what the UI should look like. I'm going to be the user, and I'm going to be using a lot of different lexicons and a lot of different models that need to be trained. What is the best way for me as a user to play around with this? So I just copied scikit-learn. There's some notion of a data frame, and it is called a lexicon. There are these different models, say a factor graph or a Markov chain, etc.; this is where you define the properties of the model. I can later fit that on a separate lexicon, and once those models are trained I can generate from them. The hopeful idea is that I can somehow make an ensemble based on 2 different models, maybe even give them some weighting, and then pass the outcome of this to some sort of judging model. Simply writing this down for myself before working on it made it a lot clearer for me to
know what I should be building and how I should design it. Hopefully, maybe by the end
of next year, I can come back and talk about a model that can make Pokémon names where you say: hey, I don't know this token and that one, and it fills in the tokens one by one. Another thing, in the same spirit
as the deep neural network work, is the
dream of building something that works on tokens for art as well.
This is actually something that doesn't use any entropy, but it is something that you can generate fairly easily in just a couple of lines of JavaScript. If you're interested in this, come on Friday to my other talk, where I will make pretty images sort of like this but in 3D. The goal of that talk is simple: it is for you if you are interested in building pretty stuff that looks like this. So, concluding:
sampling can be a lot of fun and sometimes actually profitable, as in the Lego example. Getting started is super easy, and it may be surprising how often it can help you out. It is very flexible: if you can describe the system as something you can sample from, you can actually solve a lot of problems. Python is a great language for this; it's surprisingly easy to be very flexible in just a couple of lines of code, and it is also quite fast. And one more thing, when you are considering an API: try to optimize for the joy of whoever is going to be using the library. It's
better to spend an entire day thinking about the API that you would most want to start using than to have a mountain to climb later on.
Think about it from the user's perspective. That was the idea; I'll take any
questions but anything poking unrelated that's not about the game promoters talking environments and really since no
question born president that thing the
lecture on the on the on the also if you done something similar to this also come talk to me this is sort of the project but I'd be happy to hear someone other than to points in similar I thinking is that this beautiful little bit yesterday and I question is how much of the program question but if you had a simulated 10 network of cues hence the human you know and how would you do that so there's too few
Yeah, so that use case is definitely a problem, right? The main gist of what I'm trying to get to at some point is music, and music would be more difficult, because let's say you have a bass that's playing, drums that are playing, and then you have the melody, for example. Then you have 3 sequences that actually have to correlate somehow, and in that case they don't really fit into this model. What you could do then (there are a couple of libraries that have a bit of support for this) is go to probabilistic graphical modeling. One trick is to say that the 3 tokens I see at the same moment count as 1 separate token; that will go into this one list and I'll try to generate that. You might need a lot of data before you actually get the pattern right, so it is a little expensive. The other way is not having the complexity in the data but putting it into the model, where you say: these are 3 time series that are somehow correlated, learn that. You could do that with a sampling approach; PyMC3 has some examples of that, and then you get sort of correlated time series. If I had a whiteboard I could explain it more easily, so come to me afterwards.
Anyone else? Yes.
Correct me if I'm wrong, but I believe the question is: hey, this is a nice talk, but obviously when you have more dimensions it gets harder, right? For the triangle example it is reasonably easy, since there are only 6 dimensions. Right, so dimensionality is always an issue. Suppose we had
this but in 12 thousand dimensions; obviously it's a whole lot harder. What I would say here is: consider this as an approach. It actually has a couple of use cases where you can solve something, especially in the optimization field where you have a lot of hills and many, many
dimensions. The reason why we use genetic algorithms is that they're the best thing we have, not necessarily that they're the best thing we could have. When you go to random algorithms, you do that because there's no
alternative, such as a greedy method. Yeah? So I didn't
quite get was why you couldn't just use an LSTM. So the thing with an LSTM is that usually you can say: hey, generate lots and lots of sequences. The issue I sort of have is with this one use case. Suppose that you say: I want to have a Pokémon name that starts with an H, then an A,
and then after 3 free tokens comes a given letter, and I want the model to finish
the sequence.
With a model trained on Pokémon names, learning internal states, the usual way I would generate a new sequence would be to say: start, go, and then at some point it samples a stop token and you have a Pokémon name. I could say: generate all of them
until at some point you have a Pokémon name that starts with an H, then an A, then the free tokens and whatever. But that feels like oversampling a bit. So this is more of an open problem. It is very fun to go to academic conferences and ask this to professors. There should be a more generic neural way to approach this problem, and this might be where the generative work from OpenAI comes in. I've never really heard anyone say: here's an obvious solution to this problem. It's not the most important problem in the world, so I'm not going to get angry about it.
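The brute-force workaround mentioned in that answer (keep generating until a name happens to match the letters you wanted) is rejection sampling. The sketch below uses a small letter-level Markov chain as a stand-in for the trained model, because the talk gives no code for the neural version; the corpus, the constraint format (a position-to-letter dictionary) and every name here are assumptions for illustration.

```python
import random
from collections import Counter, defaultdict

START, STOP = "^", "$"  # hypothetical start and stop markers

def fit_markov_chain(words):
    """Count how often each token follows each previous token."""
    counts = defaultdict(Counter)
    for word in words:
        tokens = [START] + list(word) + [STOP]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def generate(counts, rng, max_len=12):
    """Sample forward from the start marker until a stop marker is drawn."""
    token, letters = START, []
    while len(letters) < max_len:
        options = counts[token]
        token = rng.choices(list(options), weights=list(options.values()))[0]
        if token == STOP:
            break
        letters.append(token)
    return "".join(letters)

def generate_with_constraints(counts, fixed, rng, max_tries=10_000):
    """Rejection sampling: regenerate until every fixed position matches.
    Returns (name, tries), or (None, max_tries) if nothing matched."""
    for tries in range(1, max_tries + 1):
        name = generate(counts, rng)
        if all(pos < len(name) and name[pos] == ch for pos, ch in fixed.items()):
            return name, tries
    return None, max_tries

names = ["pikachu", "raichu", "bulbasaur", "charmander", "squirtle",
         "metapod", "pidgey", "rattata", "vulpix", "psyduck"]
model = fit_markov_chain(names)

# Ask for a name whose 1st letter is "p" and whose 2nd letter is "i".
name, tries = generate_with_constraints(model, {0: "p", 1: "i"}, random.Random(0))
print(name, tries)
```

The `tries` counter makes the cost visible: every extra fixed position lowers the acceptance rate, which is the oversampling concern raised in the answer.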
also the level of of events of
hidden