Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask
Formal Metadata
Title  Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask 
Title of Series  EuroPython 2014 
Part Number  95 
Number of Parts  120 
Author 
Przymus, Piotr

License 
CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. 
DOI  10.5446/20018 
Publisher  EuroPython 
Release Date  2014 
Language  English 
Production Place  Berlin 
Content Metadata
Subject Area  Computer Science 
Abstract  Piotr Przymus: Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask. Have you ever wondered what happens to all the precious RAM after running your 'simple' CPython code? Prepare yourself for a short introduction to CPython memory management! This presentation will try to answer some memory-related questions you always wondered about. It will also discuss basic memory profiling tools and techniques. This talk will cover the basics of CPython memory usage. It will start with basics like the representation of objects and data structures. Then advanced memory management aspects, such as sharing, segmentation, preallocation or caching, will be discussed. Finally, memory profiling tools will be presented.
Keywords 
EuroPython Conference EP 2014 EuroPython 2014 
Transcript
00:00
now
00:16
We have a talk from Piotr Przymus (a difficult name, sorry): "Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask". Have fun. Thank you very much.
00:45
Thanks for the introduction. I won't cover all the topics of Python memory, because the subject is too complex and I had to cut something, but I will try to do my best. I will probably run out of time, so if you have any questions, please ask me, especially during lunch; after lunch I am going back home. OK, a few words about me. I'm still a student, and I work as a research assistant
01:14
at Nicolaus Copernicus University. My main scientific interests are databases and GPGPU computing, and I try to combine the two. I also did some work in data mining. I have about 8 years of Python experience, and I did some projects with Python; I list three of them here. I was responsible for preparing parts of a trading platform for an asset management company. I was also responsible for preparing mussel biomonitoring analysis and data mining software for a laboratory, and now we are thinking about commercializing it. And for my PhD thesis I prepared a simulated processing environment for the evaluation of database algorithms. I mention these projects because all of them had something in common: they were memory-intensive, they were long-running, and during the computations they tended to grow in memory. At some point I decided that I had to know something more about how Python manages memory, what the sizes of the different types are, and what the strategies for allocating different containers are. This will be covered in the first two sections, and later on I will try to say a few things about memory profiling tools. So let's start with the basic stuff, from the simplest possible ground. I also teach Python to students, and afterwards I am not sure they really know what the sizes of the different types are by the end of the class. In Python this knowledge isn't required, so you don't even have to care what the sizes are, until at some point your application is large enough and occupies a lot of memory. Then you have to start thinking about the sizes of the different Python objects. I won't go through this table in detail; it leads to some
interesting stuff. OK, one thing to note is that long in Python 2 and int in Python 3 are actually limited only by your memory, so as long as you have enough memory you can store a very large number. You also have to know that these sizes are pretty large if you compare them to C, because there is some overhead from the garbage collector and other things. It is also worth noting how strings and unicode strings are represented in memory: there is a fixed overhead, and we also have to pay a number of bytes for each character in a unicode string. The same goes for tuples, where the overhead is even larger, and you have to pay 4 or 8 bytes for each element. Of course, you can check it yourself,
04:51
and from Python 2.6 we have sys.getsizeof,
04:54
and you can use it to look at the size in bytes, with some restrictions. All the built-in objects will return correct results, but if you use it on third-party libraries it might return some crazy stuff, so be aware of that. It actually calls the object's __sizeof__ method and adds the additional garbage collector overhead if the object is managed by
05:28
the garbage collector. So let's do
05:31
something more interesting.
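The size check described above can be tried directly. The following is a minimal sketch, assuming a recent CPython 3; the sample objects are arbitrary choices, and the byte counts are deliberately not hard-coded because they differ between versions and platforms.

```python
import sys

# Sizes are in bytes and are CPython- and platform-specific,
# so run this to see the values on your own interpreter.
samples = [0, 2**100, 1.0, "", "a", (), (1, 2, 3), [], {}, set()]
for obj in samples:
    print(f"{type(obj).__name__:>6} {obj!r:<35} {sys.getsizeof(obj)} bytes")

# getsizeof is shallow: a list reports only its own array of pointers,
# not the objects those pointers refer to.
inner = ["x" * 1000]
shallow = sys.getsizeof(inner)
deep = shallow + sum(sys.getsizeof(item) for item in inner)
print(shallow, deep)
```

Because the reported size is shallow, a container holding large objects can look small; summing over the elements, as in the last lines, is one simple way to get a rougher but more useful figure.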
05:34
Here is a fun example. We create two lists with the same size. In the first one we append a fixed number each time, and in the second one exactly the same number of elements, but different numbers. Do you think there will be any difference between the memory allocated for these two? This is a fun example, so yes, there should be something different. And as you
06:06
can see, the size of the first list is actually less than half of the second. This is because of object interning. So what is that?
06:20
OK, and by the way, there is a general rule for
06:24
creating objects: when we create an object and assign it to a variable, the object is created first and then assigned. Variables just point to objects; they don't hold the memory. To every rule there is an exception, though. This one is mainly due to performance optimizations, and it is highly implementation-specific. All the examples in this presentation are from CPython, and they might change over time; there was already one change in the CPython implementation of object interning. So what is interning? Often-used objects are preallocated. Instead of being created every time we write, say, a = 0, that 0 won't be created each time; it will be shared among all the references. Here we assign 0 to a and b, and if you write a is b you will get True, and of course the values are also equal. And in this example we assign a large number, and a is b evaluates to False, while the values are of course the same. Actually, someone mentioned to me a test from two days ago where there was a similar question. But still, this is highly Python implementation specific. So let's
08:11
talk a bit more about object interning. I will say this once more: this is Python implementation dependent, it may change in the future, and it probably will. It is not documented in the Python documentation for programmers, and if you want a reference for the values given here, you have to consult the source code. OK, so in CPython 2.7 through 3.4 we have object interning for integers in this range. We also have interning for strings and unicode in Python 2, and for bytes and unicode strings in Python 3: what is interned is the empty string and all strings whose length is equal to 1, with the restriction for unicode that only the Latin-1 symbols are covered. An empty tuple is another example of an object that is shared.
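The identity checks described here can be reproduced with a short sketch. It assumes CPython, which currently preallocates the integers from -5 to 256; the values are built at run time on purpose, because two equal literals in one piece of source code may be merged by the compiler and would hide the effect.

```python
# Build the integers at run time so the compiler cannot fold two equal
# literals into one shared constant.
small_a, small_b = int("100"), int("100")
large_a, large_b = int("100000"), int("100000")

print(small_a is small_b)   # shared preallocated object on CPython
print(large_a is large_b)   # two separate objects on CPython
print(large_a == large_b)   # values are equal regardless

# The empty tuple is another shared object on CPython.
empty_a, empty_b = tuple([]), tuple([])
print(empty_a is empty_b)
```

None of this is a language guarantee, which is why `==` remains the right way to compare values and `is` should be kept for identity checks such as `x is None`.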
09:26
Now something different but still about interning: string interning. We start with a simple example. We create two strings that are almost the same, try a is b, and get False, of course. But if we use intern (this is for Python 2) and try this once more, we get True. So let's try to use it for something bigger: we create a lot of strings, and this uses 57 megabytes of resident memory. If we do the same with intern, we actually reduce that memory usage. But what is actually happening when we use intern? String interning,
10:26
by definition, is a method of storing only one copy of each distinct string value; what we have to remember is that the strings must be immutable. In Python 2 we have the built-in function intern, and in Python 3 it was relocated to the sys module, so we have sys.intern. If you use this function, you enter the string into the table of interned strings and get a reference to it back. What is returned might be the same string, if it was already interned, or it might be a copy of the string. When can we use it? From the documentation: it is useful to gain a little performance on dictionary lookups. Also, the names used in programs are automatically interned, and the dictionaries used to hold module, class or instance attributes have interned keys. And as in the previous example,
11:33
we can also reduce the space used when we have a lot of identical strings in our code. OK.
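A minimal sketch of the interning behaviour just described, using the Python 3 spelling `sys.intern`. The strings are assembled at run time, since string literals that look like identifiers are usually interned automatically; the sample values are made up for illustration.

```python
import sys

# Two equal strings built at run time are normally separate objects.
a = "".join(["memory", " in ", "python"])
b = "".join(["memory", " in ", "python"])
print(a == b, a is b)

# sys.intern returns the single shared copy from the interned-strings table.
ia, ib = sys.intern(a), sys.intern(b)
print(ia is ib)

# Interning many repeated run-time strings keeps one object per distinct value.
words = [sys.intern("".join(["key", str(i % 10)])) for i in range(1000)]
print(len({id(w) for w in words}))
```

The last line counts distinct objects rather than distinct values, which is what matters for memory: 1000 list slots end up pointing at only as many string objects as there are different keys.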
11:44
So let's say something more about mutable containers. The different ones in Python are lists, sets and dictionaries, and behind the scenes there is a strategy for allocating containers. The strategy is to plan for growth and shrinkage: we slightly overallocate the memory, so that when we append an element we don't always have to reallocate, and so we leave room for growth. We also have to remember that sometimes we have to free the memory allocated for a mutable container. This reduces the number of expensive function calls like memcpy and so on. And of course we try to use an optimal layout for performance reasons. So
12:43
here is a very, very simple example with a list. When we first append one element to the list, we get an allocation, but not for 1 element: for 4 elements. After that, if we append something, we have free space, so the append is free of memory operations. We can put in another element, and another, and then another. When we append the next element, the list is full, so Python will reallocate the memory for our list and prepare room for more elements. So how does it exactly
13:29
work? A list in Python is represented as a fixed-length array of pointers, so we just point to objects. The list is overallocated; at the beginning the growth looks like this, and for large lists the overallocation is less than this percentage. OK, some considerations about performance, due to the memory actions involved. When we put something at the end of the list, the operation is cheap. But if we put something in the middle or at the beginning, we have to copy or shift the memory, so those operations are expensive. It is also worth mentioning that for lists of 1, 2 and 5 elements we waste a lot of space, so if you have a large number of small lists, you have a lot of
14:30
overallocated memory.
14:33
OK, here is the overhead of the list itself, and you have to pay this price for each element, depending on the architecture. Shrinking of the list happens when the number of elements we use goes below half of the allocated space.
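The stepwise growth of a list can be observed from Python itself. This is a small sketch, assuming CPython, where `sys.getsizeof` on a list reflects the allocated capacity rather than the current length; `list_growth` is a hypothetical helper name, and the exact step sizes vary between CPython versions.

```python
import sys

def list_growth(n):
    """Return (length, size_in_bytes) each time the list's reported size changes."""
    lst = []
    steps = [(0, sys.getsizeof(lst))]
    for _ in range(n):
        lst.append(None)
        size = sys.getsizeof(lst)
        if size != steps[-1][1]:
            # The reported size changed, so this append triggered a reallocation.
            steps.append((len(lst), size))
    return steps

for length, size in list_growth(64):
    print(f"len={length:>3} size={size} bytes")
```

The output has far fewer lines than there were appends: most appends land in space that was already reserved, which is the over-allocation strategy at work.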
14:59
OK, let's talk about allocation for dictionaries and sets. It is pretty similar. We overallocate when we reach two thirds of the capacity of the dictionary or set. For small dictionaries and small sets we quadruple the capacity, and once it is big enough we double the capacity to extend it; then we have to calculate the new size for the object and allocate the memory. Shrinkage of a dictionary or set happens when we delete a lot of elements. OK, that's it for the basics.
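The same kind of observation works for dictionaries. The sketch below assumes CPython and only records the points at which the reported size of the dictionary jumps; the growth factors and thresholds are implementation details that have changed between versions, so none are asserted here. `dict_growth` is a hypothetical helper name.

```python
import sys

def dict_growth(n):
    """Return (length, size_in_bytes) each time the dict's reported size changes."""
    d = {}
    steps = [(0, sys.getsizeof(d))]
    for i in range(n):
        d[i] = None
        size = sys.getsizeof(d)
        if size != steps[-1][1]:
            # The hash table was resized to make room for more entries.
            steps.append((len(d), size))
    return steps

for length, size in dict_growth(1000):
    print(f"len={length:>4} size={size} bytes")
```

Comparing the lengths at which the jumps occur gives a rough picture of how aggressively the table grows on a given interpreter.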
15:50
Now another example: in Python we can represent data in different ways. We can use classes, with or without slots, named tuples, tuples, lists and dictionaries. I reproduced an example
16:07
from a book, but I added more objects and covered some more fields, and you can see how they differ when storing the same data in Python using different types. As you can see, there are some restrictions when you put __slots__ into your classes, but you can gain a memory reduction: you get lower memory use for those classes.
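A comparison along these lines can be sketched as follows. The class names and the three fields are made up for illustration and are not the ones from the talk's table; the sizes printed are shallow, CPython-specific figures, so only the relative order is of interest.

```python
import sys
from collections import namedtuple

class PointDict:
    """Ordinary class: attributes live in a per-instance __dict__."""
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

class PointSlots:
    """Class with __slots__: fixed attribute set, no per-instance __dict__."""
    __slots__ = ("x", "y", "z")
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

PointTuple = namedtuple("PointTuple", "x y z")

plain = PointDict(1, 2, 3)
slotted = PointSlots(1, 2, 3)

sizes = {
    "class + __dict__": sys.getsizeof(plain) + sys.getsizeof(vars(plain)),
    "class with __slots__": sys.getsizeof(slotted),
    "namedtuple": sys.getsizeof(PointTuple(1, 2, 3)),
    "tuple": sys.getsizeof((1, 2, 3)),
    "dict": sys.getsizeof({"x": 1, "y": 2, "z": 3}),
}
for name, size in sizes.items():
    print(f"{name:<22} {size} bytes")

# The restriction: a slotted instance rejects attributes not listed in __slots__.
try:
    slotted.w = 4
except AttributeError as exc:
    print("cannot add attribute:", exc)
```

The trade-off is visible in the last lines: the saving comes from giving up the per-instance dictionary, so new attributes can no longer be attached at run time.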
16:44
OK, so some notes on the garbage collector and reference counting in Python. As you probably all know, Python has a garbage collector, and it collects objects when the reference count goes to 0. Here is some information on the operations that increment the reference counter, and on the operations that decrement it. But there is a warning, which I put here from the official documentation: if you override __del__, you can have problems. The Python garbage collector can deal with cycles in object references, but when you use __del__, it is not possible for Python to guess the correct order for calling __del__ on the objects in the cycle, so this cycle won't be deallocated from memory.
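The difference between reference counting and the cycle collector can be shown with a short sketch. `Node` and `cycle_demo` are hypothetical names, and a weak reference is used only as a probe to see whether the object is still alive. Note that the `__del__` limitation described above applies to the Python 2 era of the talk; since Python 3.4, cycles containing objects with `__del__` can be collected as well.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

def cycle_demo():
    """Return (alive_after_del, alive_after_collect) for an object in a cycle."""
    was_enabled = gc.isenabled()
    gc.disable()                 # keep the automatic collector out of the demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a  # reference cycle: a -> b -> a
        probe = weakref.ref(a)
        del a, b                 # counts never reach 0 because of the cycle
        alive_after_del = probe() is not None
        gc.collect()             # the cycle collector finds and frees the pair
        alive_after_collect = probe() is not None
        return alive_after_del, alive_after_collect
    finally:
        if was_enabled:
            gc.enable()

print(cycle_demo())
```

Reference counting alone leaves the pair alive after `del`, and only the explicit collection releases it, which is why long-lived cycles can look like leaks between collector runs.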
17:46
OK, I have some more time, so I will talk about the tools you can use for Python memory profiling and analysis. Let's start with psutil, which is pretty simple. It is a cross-platform API for system utilities. To get some information about the current process's memory, you can just use psutil.Process with the PID of your process and transform this information into a dictionary, and then you can return this simple information. For most of the examples I cut the code and show only what is most relevant for the purpose. Another tool is memory_profiler. It recommends using psutil, so it is good to have that among your dependencies. memory_profiler can work in three different modes: you can get a line-by-line profile, you can get memory usage monitoring over time, and you can use it as a debugger trigger. So let's start with the line-by-line profiler. You have to put in your
19:09
code the @profile decorator on the function you want to profile, and then you can run the script with something like this; of course, you should use the name of your own script. Then you get results like these: line-by-line memory usage and the memory increment for each line. Here we see that the for loop is the main memory contributor. The second way we can use memory_profiler is memory usage monitoring over time. You just run your script under mprof, and actually you can use it for any type of process, not only for Python. But if you want to use it with Python, you should put the @profile decorator on the functions you want to track and run it with the Python option. Here I ran my simulation, and here is the result: I got a plot, and here are the markers for the functions I decorated. So I see that probably this function is responsible for the growth from here to there, and the function marked here doesn't change our memory much. And the last option
20:38
for memory_profiler is to use it as a debugger trigger. We can set up a threshold of maximum memory use and run our process, and then we will step into the debugger when we reach the memory threshold we set.
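The psutil lookup mentioned earlier (take the current process and turn its memory information into a dictionary) might be sketched as below. psutil is a third-party package, so it is imported inside the function; `current_memory_info` and `rss_mib` are hypothetical helper names, and which fields appear beyond `rss` and `vms` depends on the platform.

```python
import os

def current_memory_info():
    """Return the memory counters of the current process as a plain dict."""
    import psutil  # third-party: pip install psutil
    process = psutil.Process(os.getpid())
    # memory_info() returns a named tuple; rss and vms exist on all platforms.
    return process.memory_info()._asdict()

def rss_mib():
    """Resident set size of the current process in MiB."""
    return current_memory_info()["rss"] / (1024 * 1024)
```

Calling `rss_mib()` before and after a suspicious operation is a crude but often sufficient first check before reaching for a line-by-line profiler.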
21:00
Another tool is objgraph. It's a cool tool for visualizing object references in Python. For small projects it is pretty cool, because it gives you plots like this, and it's a good tool for
21:19
finding reference cycles in your code. If your project is large, the graph that is generated will be pretty large, and it will be hard to check anything there. But it has some code for graph manipulation, and all of this is covered in the tutorial for objgraph, so you can actually track down object reference cycles pretty easily.
21:45
OK, the next two tools will probably be covered more
21:49
in another talk in this session: Heapy and Meliae. They are heap analysis tools, pretty much the same with some differences, and they are pretty good at tracking memory usage. So let's
22:08
see what we can do with Heapy. We can run our code and take a heap snapshot here, then do some more memory-intensive operations and take another heap snapshot, and then we can actually do some arithmetic on those
22:28
heaps and get a result like this. We see that we allocated a lot of integers with this one operation.
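The snapshot-and-subtract workflow can be approximated with the standard library. The sketch below does not use Heapy; it swaps in `tracemalloc` (available since Python 3.4), which groups allocations by source line rather than by object type. `allocation_diff` is a hypothetical helper name and the workload is an arbitrary example.

```python
import tracemalloc

def allocation_diff(work, top=3):
    """Run work() between two snapshots and return its result plus the top diffs."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    result = work()                      # keep the result alive until after the snapshot
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to returns StatisticDiff entries, largest absolute change first.
    stats = after.compare_to(before, "lineno")
    return result, stats[:top]

result, top_stats = allocation_diff(lambda: [str(i) for i in range(100_000)])
for stat in top_stats:
    print(stat)
```

Each printed entry names a file and line together with the size and count difference, which plays the same role as the per-type difference table from the heap snapshots.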
22:41
Another one is the combination of Meliae and RunSnakeRun. You can use Meliae to dump all the objects in your code and then use RunSnakeRun with its memory view on that dump, and get an interactive plot like this, where you can see how the memory is allocated for different objects. OK,
23:09
and this is almost the
23:11
end of my talk. You can also use different malloc implementations. It's pretty easy, and you will probably find many blog entries about using different memory allocators. It has pros and cons: you can gain something with very little effort and no changes to the code, for example better return of process memory to the system. But it might also work against you, so it depends on your application. If you want to use a different malloc implementation, you have to install the different libraries, of course, and then you can run Python with LD_PRELOAD set to the path of the library you want to use, and you can get different results. I prepared a small test: I ran the same code with the default malloc, with jemalloc and with tcmalloc, and you can see how the memory changes in different phases. As you can see, the default malloc is actually pretty good now with Python; a few years ago there were some problems, but now it works pretty well. With jemalloc you can get pretty much the same results, and for different applications you can gain something. But with tcmalloc, for this example, you actually end up with a little bit more memory allocated and not returned to the system. But again, this depends on the application. Some other
25:08
useful tools: you
25:12
can always build Python in debug mode; you can use Valgrind, and Python cooperates pretty well with it; you can use the experimental extension for gdb; and, probably for most web developers, you can use one of these, which are probably more convenient, because they are WSGI middleware versions of a memory leak debugger, like this one, and you can just put it in your WSGI app and get memory profiling. As
25:44
a summary: try to understand better the underlying memory model, pay attention to hot spots, and use profiling tools. Seek and destroy: this is actually the hardest part, so try to find the root cause and fix the memory leak; probably the next talk will be about this. There are also quick and sometimes dirty solutions. We can delegate memory-intensive work to another process: run the operation in a separate process, collect the results, and then stop the process. You can also regularly restart the process if it accumulates too much memory overhead. And you can always go for low-hanging fruit, like slots, or try different memory allocators. And here are some great references that I used
26:34
when I was preparing this presentation, so give them a try. Some of them are a little dated, like this one, but they give insights about a lot of Python memory internals. So thank you very much.
26:57
We have some time for questions, so please come to the microphone over there.
I've experienced sometimes that I had to create many objects in Python, and then I removed all the references to them and actually forced the garbage collector to run, but the memory still wasn't freed. Why does Python sometimes not free memory?
First, it may depend on the version of your Python interpreter. Second, sometimes with the malloc code we have problems with returning the memory to the system. It's a little bit more complicated, but you can try the different malloc libraries I mentioned and see whether that helps with your problem.
Do you have any hints on how to debug such memory problems? What I experienced sometimes, using PsychoPy, is that the whole process was using like four gigabytes of memory, but Heapy only showed very little of that memory, so I guess it was related to something else.
So the question is whether we have any hints on how to debug those memory problems. You can either try the debug version of Python, compile some components with the debug version, and then you can see the objects that were not deallocated by the garbage collector; or use Valgrind if you want to go low level. We can talk about it afterwards. OK, thank you very much.