Non Sequitur: An exploration of Python's random module

11 views

Citation of segment
Embed Code

 Title Non Sequitur: An exploration of Python's random module Title of Series EuroPython 2014 Part Number 109 Number of Parts 120 Author Trejo, Jair License CC Attribution 3.0 Unported:You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. DOI 10.5446/19984 Publisher EuroPython Release Date 2014 Language English Production Place Berlin

 Subject Area Computer Science Abstract Jair Trejo - Non Sequitur: An exploration of Python's random module An exploration of Python's random module for the curious programmer, this talk will give a little background in statistics and pseudorandom number generation, explain the properties of python's choice of pseudorandom generator and explore through visualizations the different distributions provided by the module. ----- # Audience Non mathematical people who wants a better understanding of Python's random module. # Objectives The audience will understand pseudorandom number generators, the properties of Python's Mersenne Twister and the differences and possible use cases between the distributions provided by the `random` module. # The talk I will start by talking about what randomness means and then about how we try to achieve it in computing through pseudorandom number generators (5 min.) I will give a brief overview of pseudorandom number generation techniques, show how their quality can be assessed and finally talk about Python's Mersenne Twister and why it is a fairly good choice. (10 min.) Finally I will talk about how from randomness we can build generators with interesting probability distributions. I'll compare through visualizations thos provided in Python's `random` module and show examples of when they can be useful in real-life. (10 min.) Keywords EuroPython Conference EP 2014 EuroPython 2014
Series
Annotations
Transcript
please welcome J is gonna tell that random module thank you non-sequitur an expression of fighters from but succeeded means it does not follow in Latin and actually this is name of talk because it's sort of describes the behavior of random sequences but also because my own interest in the topic and I was completely random and so if
it does not have much
practical relevance for me but still I think it's an interesting and beautiful topic in work talking about so I worker being Goffman's small development
shop in Mexico City and and I want to talk to you today about minimizing computers in the Python Standard Library so highlighting his work random because this
has its basically basic meanings of the unpredictability and personality idiosyncratic
annotation of 1st containing the of so this and in fact it
comes from Old French works that many things like speed of
violence a impulsiveness the Spanish word as I said that it comes from Arabic and 1st all dies so even now we called transients where this
had the mathematical term is very much related to to the gambling and meaning so it is not
coincidence that the 1st operations of probabilities In mathematical theory that measures analyzes and upon predicts random outcomes and has its roots on trying to understand gambling and what with listen into predicting deal from offending and 1 of the 1st examples of probabilistic analysis comes from a series of letters between Isaac Newton and some of that is the the president of Royal Society concerning some by and that some of it is was going to make that with the of the early in life as a process the animal the 1st thing that and we
hope that each of the 6 faces has an equal probability that come out when we're running as a before filling it with only know what this land government and even when we do a series of roles the information about the past outcomes of lies does not give us
any insight into what is going to come next the so if paradise and the number for comes out this for a
random number but it certainly isn't number chosen at random but just in looking at it we cannot know whether the process that produced it was actually a random so we can read randomness of individual numbers bad about
sequences of numbers and sequences of random numbers have many applications in real-world situations and the often used for reducing the size of a broadly based their by
sampling it at random points and this can be seen for instance in statistics where you take a
representative sample from publications like when you pick 2 people to call for my election polls or simulations word want approximated by the
sum of the probabilities of 4 fundamental properties and we can randomly generate events and then a statistical emission and the probability that we're looking for been relations a sequence of random numbers to be on a former distributed and I said this is
an even number in a certain range needs to come up with roughly equal probabilities for all the ways or associated intuitively at the same biases the 1st of this sequence and it's fair and reasonable uniform so they can consume the uses simulations of the source of numbers between 0 9 and in fact if we try and take the average the 1 of the the reasonably around for about 5 is what we will expect are for such a social
achievements and every number comes up roughly with the same frequency the but when the number of cells having more than that it been
that means security and music you communication
algorithms that use random numbers for pseudonymization and so that all interested parties will notice this is a random number active graphics and schemes was used random number for teenagers signatures in a way that doesn't really informational became even if you have a lot of Signal messages and phrases in general there is still a lot of random and secret key that you need to use for every website that is used to seeing their sessions and in them so that users for that actors the temporal with them and thus reversing the scandal the inner but can be kept up there in a random number generator used by error state
in many of the products so apparently they can predict which random numbers the arity approach became which has
a sense of security implications so this cryptographic applications require must of time more than just an unbiased sequence of numbers the required this offensive of numbers to be actually predicted
and decadent knows which random numbers have Peking or even hassle inside and all do what what to look for the random
numbers and then behave as are open to all of my secrets so this is work about default it looks over maybe but it reasonable that the 1st piece
of which of course complete the fix a
predictable sequence if another knew that I was using the from by for generating random numbers which you will only have to compute by and you will know all of my future begins so keeping in mind is
the requirements of impartiality and unpredictability that what can we use for getting pseudo sequences of random numbers 1 option is to use natural
phenomena that we know to be unpredictable when we show with sufficient accuracy instance elliptic random the this is that was very nice admission of food and this extracts random number from but in the UK there is a gene called but use transistor
measurements are the big winners in National lottery and we can also use radioactive isotopes the case which we know is the nature of predictable and
independently independently of the position of all instruments or in the quality of our models but but disk is having gone is that it
is often the slow expensive and require all of specialist
equipment to measure and this sort of natural quantities to generate random numbers so it may be useful to generate this that a
large part of random numbers once and then compiling this into a table of random numbers that we can draw from an in the future as a matter of fact in 1955 at the RAND Corporation which a table of a million brand and the it's obtained from a specialized hardware and this knowledge of came to be widely used in simulations for engineering and science bad of course not Latin America they will still have some disadvantages of that especially with the computer from back then it is very characters and efficiently and access such a letter to last A which led researchers looking techniques and for random number generation on the fly so of course computers are deterministic effects so the future receipt of the
matching is completely determined by the present state was so how can that actually generate random numbers the what do so that others we incorporate input from outside devices but we can only generate pseudo-random numbers that is
to random number generator all the numbers that look around 1 statistically measuring them at but they're not actually hard to predict if you have enough information about the state of the United Internet importance number Newman was doing simulation work the a of stream of random numbers it he came up with a generating it by
taking a random number of squaring it and then taking the form the middle leads to over this next deal with the generated solution of random but it is
going to be corrected for it because for instance if we get a serial somewhere in there that means that from then on the sequences the land of and and it also has a tendency to avoid the the lord of the performing dilution look and which of course there's no way to getting out of if I we can use the received and make sure this how long this
generator run before starting to repeat numbers and we can see
that it for 40 years the sequences are not very long but if we take 1 of those long sequences and check the average value and the statistical properties we can see that they looked at regional around so this is what you will generally randomness more precisely
and if you want a mathematically valid randomness with ways to formalize what its impartiality and its unpredictability
aspects and as as I went from images predictability is to look at the end of each of oral i which is a measure of the space of possibilities that can take and it's important to note that this cannot be immediately tell from looking at the numbers you have you have to actually look in the process that generated for instance here we see those numbers and I told you that may be Democrat and directing the they pick them for arise from several
differ from 0 to 100 but what if I told you that they are actually prime numbers then you will see that the action space from which was brother was much smaller than without so it's you that whole my
bank as be a password but only can be like a correct long and they can use
repeating patterns representative numbers so generally a reducing in the space of possible possible so they can be
of course it might be worth it it is if it stops people from using password of ranking better and extreme version of the we can look at as at the statistical
properties of a random sequence and see that they are consistent with the relativistic position predictions but when taking the randomness of the views of
5 we use the very informal tests for the rule of thumb which is that the average of the values was what was to be expected in random sequence and restraint and this this by looking at the low frequencies for different needs and see that the problem is but we have much elements to assess whether this is sufficiently rank of or distance from the world and it's only a little bit more quantitative at a much revelation is that she's corpus which is this of the
6 he's is set of that confirms the certain distributions but the general idea is to measure of the squares of the differences in reality but the values weighted by the expected value all and summing them so this
gives that sort of a measure of how much or observed frequencies deviates from what we would expect probabilistic early in the real world it with this measure we got what they were like this that he was the likelihood of observing different values of this
grammateme and if it is true do then what we we will conclude that the sequences this to the foreign or is too much deviated from but we will expect in a random sequence uh but also and different from the application of this text if it is too low then the the sequence of suspected to be you tool uniform for being granted other tests check this sequence from a complex patterns for instance in random sequences pairs of numbers need to be as uniformly distributed as numbers themselves or we can check if there gaps it and successive appearances of the same number and 1 of the length of that is consistent with a with with what we
would expect probabilistically and the social number of other problems and that we can use as a matter of fact the 1st and the and that
can be used to check at random sequences there's 1 by the American and I is the which of a series of formation of mathematical test as a series of programs that can isolate and a list of random numbers so can see more generators are random enough our on the other hand those some extended this led the miscegenation about
tests and that proved that this random numbers in some repeated regions like the spacing of embarrassing around random but relational or it tries to play circles and see which was overlap in the plane and many other our random experiments that we know what that's all what you expect and would it said that it is the performance of our random or some of the random sequence that taking into account the stress of randomness
but the generators have been devised 1 right the brazilian and
congressional generator which saw recurrence or would take initial value and use this equation to produce a chicken and of course is realized that I didn't have to repeat itself with a period no greater than n but it would be right used for a CNN we can and we can get a reasonably large sequences that exceed very good this this the coverage the growing with a set of them is that is very easy to fall into the situation all uniform numbers look
random once you when the block pairs of them based on that exhibits this kind of behavior and they're all falling and in the same straight lines we can choose better a to get rid of this behavior but it it always ends up happening at the dimension the so how do we get rid of it and the standard deviation from true random behavior 1 the most interesting is another and propose the maximum at somewhat undercutting Ishwaran which consists of a
large linear fit useful dimension a way that permits as it
was a very very large period of 2 to the 19 28 thousand but until more or less of it is also interesting that this sequence has internal state and uses that internal states to produce the actual random numbers so even if you know the
random number themselves and develop predicting immediately and the next number in the sequence you need a lot sample of him and and if you actually measure the statistical properties of the sequence but it it produces they are very very close randomness and and they exceeded or or this this we are correlations in in many dimensions of 633 dimensions so it is a very good random number generator what is the set of all
participants have made it a very popular generator this beginning in many languages and the Python is 1 of them but the by the random His messenger services its underlying default number of random random number generator there's also the question of how get to random numbers that people because you give this part of the company obtained from algorithm because I wouldn't have the music of so they have to be gathered from system activity only news and some other UNIX systems provide a source of random numbers in dev random and that is few by an entropy boom that that I've randomness
from various sources like keyword people sort of timing of mouth movements knowledge in the sound the memory the phrases such as and so when you have them to when users needs random numbers they can get through a random number from this book and of course getting random numbers so the poor sort of being beautiful rendering motion so
when it directly usable with more and and besides the regular source of further assume a system like at the keyboard or the
mouse and modern computer systems of them interoperate some form of hardware random number generation inductive stable bridge family a common dedicated random number generator in so now we have the right to the actual random Hamilton and the new
patterns found in real it starts from
this generator of numbers it is your mind on a from retributive and provides a lot of other interesting distributions based on that so the way it is it is there is a class in the model random which can be seen
as and and that provides a method random that's going to produce a sequence of numbers are from June 2 1 if we can use CD relevant to repeat with the same sequence all the users and
several times if we don't then we can't just let by them seeded with number get from that the random or from the number of seconds at the time of the call the have from real random numbers it is 0 when it is very
easy to get reality or in the year random numbers in all to a certain number we just multiply it around on wheels by the maximum value if you have a specific grants and you have to you random number to the width of the range and then after the by the start of this is still very is very easy and the emitted from the random and you also need to add a certain steps in this sequence of possible random numbers and user-generated media alter the number of steps and then after the way start and you have your intermediate so we can generate random real certain generated random Indians and if we need to include because of that that the whole interval numbers of so when within number of in a and B including a and B beginning is a special function branding that just false friend range with the appropriate documents but we can also be the awareness of a from celebrations of random operations in a sequence of phrases we might want to get a random element in the sequence which is very easy is generated in the in the range of indexes for the sequence a big and the element corresponding to that index if we the sample we use for the process several times if we want sample without replacement we need some form of tracking of which numbers we have already in a big as there are 2 ways to do it again tracks which numbers you can still think in least and remove from their marriage and a pigment or detainees assessed tool remember
which number you have already peaked our the heart which is more efficient depends on the size of the population compared to the size of the sample and you wonder that and the by members will actually computes this on the fly and use the more efficient you also might want to shovel it had been used
by the random model is the future dates shuffle which is just the list and exchange every region with a randomly picked up for with another 1 of the
standard if we needed to these of course destroys the sequence of so even if you a simpler way to do it that gives us a new lease we can just sort by a random key this is not as efficient but but it's much more simple now we may be interested in front of the real numbers that have another distributions other than uniform 1 but how many
about it what let's consider the normal distribution and it is at the moment 2 parameters Music Lab and in this field because know which we have numbers we can know the probability of picking it up with this what is normal distribution and but we can also read the of the a random variable with this distribution falling before every real number and this is called the cumulative distributive and the cumulative distribution function for the Bible I would consider this always increase and this means that the success of normally distributed random numbers with then generate a uniform random number and all of that will
be the from that ways that we will use brother reading in this plot and then we can check to the 1 x those that probably the corresponds and therefore the that selection is going to be an number this does not only apply to the normal distribution with 2 solution for which we know distributed function
but but it is not always obvious or easy to generate the embarrassed collective distribution function just from looking at the distribution function so there are many mathematical objects that have been devised to use these combinations on so for instance or for the normal barrier this mission we use sort of
mathematical thinking what would be the random numbers them to generate a point in a typical and the
x and y coordinates of the point and end up being normally distributed and from there we can get a number of interesting distributions the that we might do you might know from science engineering
like the triangular distribution government distributions and the prior to the commission or do we want a solution to the spread of because it's so it can be a problem use of approximate the other ones the
another 1 of node is that when mice distribution there is sort of like a normal distribution but for angles measured because when we
have angles with incident several angles may actually correspond to the same point in this initiative so the efficient it's wrapped around the signal to consider the effect of this of dislike double and angles for every point and finally the random of great for the instance of random class and provide
admittance s module methods so you can simple random and if you don't care about the state of the generator during those used the model functions that is units of like for affecting implications for because you need to independent generated for different experiments you can actually instance a glass and entered the manually it also subclass of the random class to write your random number generator and the debate and random model comes with it with here for but with compatibility the reasons and as an example of how how to provide your own random number generator and thus also see some random will and good numbers from the system provided a random number generator meaning in UNIX systems and this in a library that will connect to the random the dog 7 and use that as a source for random numbers so if you draw a random numbers you can use this and she is all of the other methods rely only on these generators in all 3 number from 2 the 1 and then they will
still work even if changing the source of the actual numbers so controlling the division of randomness is more philosophical than a mathematical problems that we can use mathematical definition is a very
useful for our proposed and active if you need sequences of the music but became as a friend and we can use to the random number generation but if we need number that completely unpredictable it sources event of a like input devices noise measurements or other external mental phenomena and for most of random number and it's by the grace more than adequate and finally will like to talk about the book that inspired his stuff this is a very good book the base very
short basic program and users literally criticism techniques that
civil war it sounds for effective but is actually a very interesting book and a couple of randomness In this with cause the interesting interesting to this very very beautiful topic and the understanding of preventing wanted to off of this book is about
random numbers it is very juridical but it's also very funny the civilianize mathematics and and finally a joint to read a little bit more there's a series of really good articles of randomness in Dorothy baked that
might help you understand why is found this important in the the every good description the 2nd link there's a very good description of the whole of how statistical testing of random numbers and if you want to read more about the possible back there in the receives random number generator disaster began article very good so that will be
interim
Randomization
Lecture/Conference
Latin square
Expression
Module (mathematics)
Code
Module (mathematics)
Quicksort
Sequence
Randomization
Computer animation
Computer
Military operation
Software developer
Directory service
Module (mathematics)
Predictability
Impulse response
Arithmetic mean
Word
Lecture/Conference
Series (mathematics)
Newton, Isaac
Arithmetic mean
Mathematics
Video game
Process (computing)
Computer animation
Root
Term (mathematics)
Operator (mathematics)
Mathematical analysis
Theory
Series (mathematics)
Computer animation
Information
Lecture/Conference
Lie group
Number
Randomization
Random number generation
Process (computing)
Computer animation
Cartesian coordinate system
Sequence
Number
Point (geometry)
Statistics
Random number generation
Theory of relativity
Sampling (statistics)
Instance (computer science)
Event horizon
Sequence
Category of being
Word
Summation
Computer simulation
Sample (statistics)
Computer animation
Simulation
Frequency
Computer simulation
Computer animation
Average
Cellular automaton
Source code
Range (statistics)
Digital signal
Uniform space
Euler angles
Sequence
Number
Musical ensemble
Algorithm
Randomization
Numbering scheme
Random number generation
Key (cryptography)
State of matter
Pseudonymization
Electronic signature
Message passing
Telecommunication
Website
Information security
Error message
Product (category theory)
Random number generation
Meeting/Interview
Lecture/Conference
Multiplication sign
Cartesian coordinate system
Sequence
Information security
Number
Default (computer science)
Random number generation
Computer animation
Sequence
Number
Predictability
Random number generation
Computer configuration
Instance (computer science)
Sequence
Ellipse
Computer animation
Personal digital assistant
Scientific modelling
MiniDisc
Measurement
Position operator
Random number generation
Computer simulation
Computer animation
Lecture/Conference
Natural number
Computer
Computer hardware
Sound effect
Quicksort
Mereology
Computer
Table (information)
Metropolitan area network
Simulation
Demon
Random number generation
Topological algebra
Information
State of matter
Streaming media
Number
Tabu search
Pseudozufallszahlen
Computer animation
Lecture/Conference
output
Uniform space
Metropolitan area network
Randomization
Random number generation
Computer animation
Data acquisition
Dilution (equation)
Instance (computer science)
Arithmetic logic unit
Sequence
Form (programming)
Category of being
Randomization
Topological algebra
Computer animation
Average
Gamma function
Sequence
Number
Predictability
Medical imaging
Randomization
Process (computing)
Spacetime
Computer animation
Validity (statistics)
Prediction
Instance (computer science)
Measurement
Number
Group action
Spacetime
Computer animation
Prime number
Family
Revision control
Category of being
Randomization
General relativity
Computer animation
Lecture/Conference
Representation (politics)
Pattern language
Prediction
Position operator
Number
Randomization
Element (mathematics)
Distribution (mathematics)
Bit
Distance
Sequence
Rule of inference
Computer animation
Average
Square number
Ranking
Software testing
Subtraction
Thumbnail
Metropolitan area network
Complex (psychology)
Randomization
Length
Instance (computer science)
Cartesian coordinate system
Likelihood function
Measurement
Sequence
Number
Frequency
Computer animation
Lecture/Conference
Pattern language
Software testing
Quicksort
Uniform space
Metropolitan area network
Computer programming
Series (mathematics)
Mathematics
Electric generator
Random number generation
Computer animation
File format
Electronic mailing list
Software testing
Sequence
Number
Asynchronous Transfer Mode
Dialect
Randomization
Theory of relativity
Random number generation
Electric generator
Computer animation
Lecture/Conference
Stress (mechanics)
Right angle
Sequence
Metropolitan area network
Standard deviation
Randomization
Topological algebra
Block (periodic table)
Set (mathematics)
Line (geometry)
Sequence
Number
Recurrence relation
Frequency
Maxima and minima
Computer animation
Hausdorff dimension
Equation
Initial value problem
Computer icon
Frequency
Random number generation
Computer animation
Linear regression
State of matter
Hausdorff dimension
Sequence
Category of being
Goodness of fit
Random number generation
Cross-correlation
Lecture/Conference
Hausdorff dimension
Sampling (statistics)
Set (mathematics)
Sequence
Number
Metropolitan area network
Default (computer science)
Entropy
Randomization
Algorithm
Random number generation
Service (economics)
Topological algebra
Multiplication sign
Source code
Boom (sailing)
Code
Mereology
Number
Formal language
Computer animation
Oval
Module (mathematics)
Quicksort
Physical system
Random number generation
Lecture/Conference
Bridging (networking)
Keyboard shortcut
Computer hardware
Right angle
Computer
Family
Stability theory
Inductive reasoning
Form (programming)
Topological algebra
Computer animation
Lecture/Conference
Scientific modelling
Distribution (mathematics)
Pattern language
Number
Social class
Randomization
Random number generation
Computer animation
Lecture/Conference
Multiplication sign
Sequence
System call
Number
2 (number)
Trail
Randomization
Context awareness
Process (computing)
Random number generation
Clique-width
Multiplication sign
Element (mathematics)
Range (statistics)
Sampling (statistics)
User-generated content
Sequence
Number
Maxima and minima
Flow separation
Subject indexing
Maxima and minima
Computer animation
Hypermedia
Operator (mathematics)
Special functions
Statistics
Form (programming)
Computer animation
Key (cryptography)
Scientific modelling
Real number
Electronic mailing list
Electronic mailing list
Information systems
Sequence
Random number generation
Cumulative distribution function
Real number
Correspondence (mathematics)
Distribution (mathematics)
Normal distribution
Moment (mathematics)
Parameter (computer programming)
Functional (mathematics)
Field (computer science)
Storage area network
Number
Plot (narrative)
Computer animation
Selectivity (electronic)
Family
Curve fitting
Point (geometry)
Metropolitan area network
System call
Vapor barrier
Random number generation
Computer animation
Logarithm
Distribution (mathematics)
Cellular automaton
Combinational logic
Mathematical object
Functional (mathematics)
Computer animation
Distribution (mathematics)
Triangle
1 (number)
Number
Point (geometry)
Metropolitan area network
Randomization
Normal distribution
Distribution (mathematics)
Sound effect
Incidence algebra
Instance (computer science)
Flow separation
Computer animation
Angle
Doubling the cube
Lecture/Conference
Internet service provider
Vertex (graph theory)
Social class
Hidden surface determination
Randomization
Random number generation
Topological algebra
Electric generator
State of matter
Scientific modelling
Source code
List of unsolved problems in mathematics
Division (mathematics)
Instance (computer science)
Functional (mathematics)
Number
Arithmetic mean
Goodness of fit
Computer animation
Hill differential equation
Physical system
Social class
Library (computing)
Computer programming
Musical ensemble
Random number generation
Source code
Port scanner
Fault-tolerant system
Sequence
Measurement
Event horizon
Number
Computer animation
Noise
output
output
Series (mathematics)
Randomization
Mathematics
Goodness of fit
Random number generation
Computer animation
Causality
Bit
Statistical hypothesis testing
Random number generation
Computer animation
Lecture/Conference
```  649 ms - page object