Exploring our Python Interpreter
Formal Metadata
Title |
| |
Title of Series | ||
Part Number | 32 | |
Number of Parts | 169 | |
Author | ||
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/21227 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
Transcript: English (auto-generated)
00:00
And will you join me in welcoming our speaker, Stéphane Wirtel, and his talk about the Python interpreter. Thank you. I think I'm not at the right conference because, in fact, it's just exploring the Ruby interpreter. No, sorry.
00:20
Just kidding. Okay, my name is Stéphane Wirtel, in French, Stéphane Wirtel. I come from Belgium, where we have some beers, and FOSDEM, and the Python FOSDEM. Okay, of course, I'm a Python lover, since the first of the decade. No, okay, it's not really important.
00:41
I'm not a CPython core dev, just that. It's just an introduction. And yes, I'm just a small contributor to CPython and Gunicorn, if you know the Gunicorn project, and, of course, CPython. I'm a nominated member of the PSF, the Python Software Foundation.
01:00
Of course, you can become a member of the PSF too, and for two or three years I have been a member of the EuroPython Society, the organizer of the EuroPython event. Welcome. So, just a reminder: this is just an introduction, and I'm not a core dev, okay?
01:23
Just a contributor. If you want to contribute, just create a patch and send it. So, about the schedule. The schedule will be really simple. We have how to start with CPython, not how can I write a print statement or just that.
01:42
Just how can we read the source code and try to understand it. We will have a small question, just what's the result in Python of this expression, 2 + 2? And a small summary, okay? So, how can we start?
02:02
That's a good question, sorry. Maybe, we have the developer's guide. In fact, when you want to start to develop with CPython, that was my case at PyCon in Montreal, sometimes you can find some sprints.
02:21
There is a sprint on CPython, and that was my case where I came with my bag and my laptop, and I just asked some developers, hello, I would like to help you with CPython. The first thing they said: can you read the developer's guide, the dev guide, in short? Yeah, of course, I can. In this document, it's just a small document, sorry.
02:47
Here, please. No, it's not the direct. It's just, it's just the developer's guide. If you want to start to develop, to hack on CPython,
03:01
you can read this documentation. You will read how to make a clone of the repository, how to become a core dev, how to, for example, you want to add a new keyword in the syntax of Python. There is a small checklist, in fact, a checklist with 20 points to verify, okay?
03:23
Just that. Yeah, how to become a, sorry. The dev guide is really interesting because we have some explanations about the issue tracker, the buildbots, the Python developer FAQ, the PEPs, and you can read everything.
03:41
So, that's really interesting because you have the getting started, how to compile Python, how can you help, for example, with the documentation, with the source code, or with everything. How to write a test and just how to run the test on several platforms. Okay, so, come back to the doc, my presentation.
04:05
So, the dev guide will explain the quick start, the grammar, how to change the syntax, and just the design of the CPython compiler. Yes, we have the source code of Python. There will be the lexer, the parser, the peephole optimizer, and just the bytecode at the end with the virtual machine.
04:22
So, you can find the documentation at this location. When you start, okay, I'm sorry, I have a question, how can I, not for you, but for me, when you go in a sprint, I have an issue, I have this issue, how can I fix it, how?
04:40
I don't have time, but you can send a message to this mailing list, Python Mentors. Python Mentors is just a big mailing list where you can send a request and you will get some responses from Guido van Rossum, Brett Cannon, R. David Murray, Victor Stinner, myself, and, of course, other developers.
05:02
That's really interesting because you can discuss the solution for your issues, okay? In my case, I wanted to modify the interpreter, just the lexer, because I found an issue, an error. And when I sent my message, my mail,
05:20
I received a response from Guido where he told me, sorry, but, in fact, in Python, there is not one parser, but two parsers for the syntax, okay? So, of course, you want to start, where to start? You have the mailing list.
05:40
We have the mailing lists, we have the announce list, new-bugs-announce. In fact, when you create new bugs in the bug tracker of Python, you have a mailing list for that. You can follow it just to receive some notifications. When you want to discuss a bug, you have one mailing list. This mailing list is just mapped
06:01
with the bug tracker, with Roundup. If you need some help, you have the mentorship mailing list. If you want to discuss one big point in the core of Python, you have the mailing list, python-dev. If you want to create, if you have an awesome idea,
06:22
for example, the FAT Python project to try to improve the performance of Python, you can try to submit something on the mailing list, the python-ideas mailing list, and you will see if you have a good result, good feedback or not. In this case, that was not. And yes, today we discussed the performance
06:43
with asyncio, with uvloop, and the rest. If you are interested in the speed of Python, you can discuss on this mailing list. That's very useful. It's a mailing list where we discuss the internal parts of Python, okay? Not about how to use the best,
07:00
what's the best practice for the performance. Okay. So, how to contribute. Firstly, that's really simple. You go on the bugs.python.org website, and you create a user account. With this user account, you have to sign
07:21
a contributor agreement with the PSF, because the owner of the source code is the PSF. Looking for something. Yeah, sorry, excuse me.
07:42
So, step two, just how can I improve CPython? Firstly, you have the documentation. Please, we have good documentation, but we have some missing tutorials. For example, asyncio. We have the documentation about asyncio,
08:00
but we don't have a tutorial about that. How can we start with asyncio, how can we use it, and the rest. We have a reference in the documentation. If you want to contribute, that's the right place. Yes, of course, you can create some issues, file them, if you find a bug, of course.
08:20
Or if you want a feature, a new feature. I'm going to show a feature, a small feature. We need some reviewers, yeah. If you look at the source code of Python, we only have 10 commits per day.
08:41
It's not really big. If you want to contribute, just review the patches, and we will be happy, okay? You will receive a good message, thank you, because that's really interesting and important for us. Sometimes, you can create a patch, propose it. Just create an issue, propose a patch, and the rest.
09:03
Ah yes, and sometimes, that's really interesting because you have created your patch, and you can wait for six months before a review because we have some two or three reviewers. If you want to contribute, it's a good place and a good time.
09:23
Yes, of course, the process is really slow. I have some issues, open issues, and they have been open for two years. Sometimes, that's really difficult for me because, but why are my patches not merged into the source code?
09:42
We don't have time, sorry. Yeah, the last point, just, we try to migrate Python to GitHub. Bye bye, Mercurial. Yes, yes, yes, you can create an account. You can use your account on GitHub and just create a pull request. I prefer that.
10:02
Usually, when you try to create a patch, firstly, you download the branch. You create your patch, and you create a diff. This diff file, you will send it, you will upload it to the bug tracker. If there is a new version, your patch is just outdated.
10:22
So, okay, and now, what can we do? Just, firstly, when you start, maybe we can try to find the directories of Python and try to understand them. In fact, how can we find information? Firstly, with the documentation, the doc directory, just the manual of Python,
10:43
where you will find the syntax, the reference of the language, the reference of the libraries, of the standard library. You can buy some books. David Beazley or Doug Hellmann have some good books. That's a good reference. The grammar directory, just the grammar,
11:03
where the grammar is defined. It's just a text file, just that. If you want to modify it, you have the grammar directory and the parser directory, because if you add a new keyword, you have to lex it and just improve the parser
11:21
and the AST and the bytecode, of course. You have the Lib directory just for the Python library, Python modules. For example, you have telnetlib. If you want to modify it, it's just in this directory. The Modules directory, that's the C part of Python.
11:42
For the objects, for example, you want to learn the implementation of the dictionary of Python, you can go in the Objects directory. We have the Programs directory. It's just the Python executable,
12:01
because Python is a small executable. And Python is a library. You can load the library if you want to embed the Python in your software. Okay, about the documentation, we have the reference for the language, the reference for the library, the reference for the C API.
12:22
If you want to learn, you can read the documentation. And sincerely, we want a small tutorial. Who is an expert of AsyncIO? Okay, we have a new fix for you. So, just, yes?
12:41
Another guy, yeah? No? Oh, shit. No, it's really boring, because you have Victor Stinner and Andrew, they are discussing at the table near the lunch, and they try to improve the documentation about asyncio. They want to create a tutorial.
13:02
So, I have one question. Just one. Really, just one? What's the result of this expression? Okay, four. It's not very difficult. But for me, I don't want to know this value. I prefer to see the command line,
13:24
the lexer, the parser, the interpreter, the compiler, everything about that. And when you start to modify Python, okay, you have the Python part, but I'm just interested in the C part. When you execute the command line, you have that.
13:42
Firstly, we have the command line, of course. The command line is just executed by the python.c file. The python.c file will load the Python library. You can try on Windows, OS X or just on Linux, you will get the same result. Okay? If you want to embed Python in your software,
14:02
because you have developed some software in C++, just use the Python library. Okay, the Python, I don't know. When you execute the source code, automatically we will initialize CPython, and we try to load some modules and read the source code,
14:25
convert it to an AST and execute it. So, the lexer. The lexer is just defined, if you are interested, of course, it's the topic of this talk. You have tokenizer.c. The tokenizer will take a string, a Python string,
14:44
and will convert it into some keywords, some expressions. You have the token and tokenize modules, that's good. For example, we take x = 2 + 2, we have this result.
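The slide output is not part of the transcript, so here is a minimal sketch of how such a token list can be produced. It uses the pure-Python `tokenize` module from the standard library, which mirrors the C tokenizer but is not the same code; the variable names are only illustrative, and the exact number of tokens printed may differ from the slide because this module also reports end-of-line and end-of-input tokens.

```python
import io
import tokenize

source = "x = 2 + 2\n"

# generate_tokens() expects a readline callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    # tok.type is an integer; tok_name maps it to a readable name.
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

Each printed line shows the two things the talk mentions for every token: its type and its value.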
15:01
Where is my mouse? We have six tokens. Each token has one type and one value. Okay? You can learn with that, if you want to use it for a disassembler, if you want to disassemble Python. Yeah, you know that with Python 3.5,
15:24
we have a new keyword, two new keywords, async and await. In fact, they are not real keywords in Python. Whether it's a keyword or not is just a function of the definition of your code. The parser is really smart about that. If you will make an example,
15:44
async = True, in this case, it's just a name, not a keyword. If we try, I don't have my code, no, sorry. Yes, if we check, it's not in the keyword list of Python.
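The check against the keyword list can be sketched with the standard `keyword` module. Note that the answer depends on the interpreter version: the talk describes Python 3.5, where these two names were not reserved, while from Python 3.7 onward they are proper keywords, so a modern interpreter will report something different from what the speaker saw.

```python
import keyword
import sys

# Report, for the running interpreter, whether each name is reserved.
for name in ("async", "await", "def"):
    print(sys.version_info[:2], name, keyword.iskeyword(name))
```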
16:01
It's just, yeah, a name. So, now about the parser. You have your tokens. You can convert them into an AST. The AST is just that. Okay? For this expression, x = 2 + 2, we have a Module, we have a body. In the body, we have an Assign, and the target is just a Name, where we have an id equal to x.
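The tree shown on the slide can be reproduced, as a sketch, with the standard `ast` module. The variable names below are only for illustration, and the exact text printed by `ast.dump` varies between Python versions.

```python
import ast

tree = ast.parse("x = 2 + 2")
print(ast.dump(tree))

# Walk down the same path the talk describes: module, body, assignment.
assign = tree.body[0]
target = assign.targets[0]
value = assign.value
print(type(tree).__name__, type(assign).__name__, target.id,
      type(value).__name__, type(value.op).__name__)
```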
16:24
And we have the Add, where we will add the two numbers. For the compiler, you have the AST. I would like to convert it to the bytecode. Yeah. Just execute this source code.
16:42
Compile, you have your tree, the AST, and you can convert it to the bytecode. With this module, we can see the bytecode. If you want more information, you can read this documentation, this PEP. Yeah, I know. Okay, for the bytecode, in the C part,
17:03
we have a definition of pi, yeah? Definition. The byte code is just a compact numeric code. One byte. Not a word, just one byte. The byte code is just portable, and the byte code is just followed by one parameter
17:23
or by many parameters. Yes, it's just used by the virtual machine, in this case, the software interpreter. For the byte code, when we have this empty file, and try to convert it, we will receive a byte code.
17:43
The bytecode is just nothing. LOAD_CONST 0, and RETURN_VALUE. Just an empty file. When you create a new file, an empty file, the interpreter will execute it. If you try with this function, and try to convert it, you will get the result of the bytecode, okay?
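The slides with the disassembly are not in the transcript, so the following is a small sketch using the built-in `compile` and the standard `dis` module. The `add` function is a hypothetical stand-in for the function shown on the slide, and the opcode names that `dis` prints change between CPython versions, so the listing on a current interpreter will not match the 2016 output exactly.

```python
import dis

# Bytecode for the running example of the talk.
module_code = compile("x = 2 + 2", "<example>", "exec")
dis.dis(module_code)

# Bytecode for an empty source file: it only returns None.
empty_code = compile("", "<empty>", "exec")
dis.dis(empty_code)

# Hypothetical function, used here only to have something to disassemble.
def add(a, b):
    return a + b

dis.dis(add)
```

Running the empty code object does nothing and yields `None`, which is why its bytecode is so short.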
18:04
You have the bytecode. After that, we try to optimize the bytecode with the peephole optimizer. For example, we have x = 2 + 2. The system will convert it to four, okay?
18:22
You don't want to try to add two plus two at runtime. Another example. With if 1: print hello, we have this bytecode. With if 0, there's nothing. We remove the dead code, okay?
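Both optimizations can be observed from Python, as a sketch, by listing the opcode names the compiler emits. The helper `opnames` is a hypothetical name introduced here; which internal pass performs the folding has changed across CPython versions, so this only shows the visible effect, not the mechanism.

```python
import dis

def opnames(source):
    """Compile source as a module and return the list of its opcode names."""
    code = compile(source, "<example>", "exec")
    return [instr.opname for instr in dis.get_instructions(code)]

folded = opnames("x = 2 + 2")                 # the addition is precomputed
dead = opnames("if 0:\n    print('hello')")   # the branch is dropped
live = opnames("if 1:\n    print('hello')")   # the call is kept

print(folded)
print(dead)
print(live)
```

On CPython, the first list contains no addition instruction, the second contains no call instruction, and the third still contains the call to `print`.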
18:41
Via the peephole optimizer. So, now we have the peephole optimizer, we have the bytecode, how to interpret it. The interpreter is just a virtual stack machine. This virtual machine will execute the bytecode. It's just a stack. We push an element, we pop it. We execute something, we pop it, okay?
19:01
So, a small example, where we try to create a small interpreter. Maybe there is a bug, I didn't test it. But, an interpreter is just, we have a stack, a pointer on the instruction, the current instruction, and we run, we read each instruction.
19:21
In this case, I just create a small bytecode. Example. Firstly, I try to push five, push again three, and push ten. And the rest is just add, add, and pop. When I'm going to read the source code, the bytecode, via the interpreter,
19:41
I push five on the stack, push three, push 10. Just add, I will pop the two last elements, and I get 13. After that, I will add another value, take the two values on the stack,
20:00
and get 18. It's just add. I can do the pop. I will empty the, I will erase the stack. So, do you remember this instruction? Just add. We have the bytecode, okay? And, yeah, we have the bytecode.
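The speaker's own code is not in the transcript, so this is a minimal reconstruction of such a toy interpreter, under the assumption that it has three instructions (`PUSH`, `ADD`, `POP`), a stack, and an instruction pointer. The opcode names and the `run` function are illustrative, not CPython's.

```python
def run(program):
    """Execute (opcode, argument) pairs on a simple stack machine."""
    stack = []
    pc = 0  # index of the current instruction
    while pc < len(program):
        op, arg = program[pc]
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            # Pop the two topmost values and push their sum.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "POP":
            stack.pop()
        else:
            raise ValueError("unknown opcode: " + op)
        shown = "" if arg is None else str(arg)
        print(op, shown, "->", stack)
        pc += 1
    return stack


program = [
    ("PUSH", 5),
    ("PUSH", 3),
    ("PUSH", 10),
    ("ADD", None),   # 3 + 10 = 13
    ("ADD", None),   # 5 + 13 = 18
    ("POP", None),   # discard the result, leaving an empty stack
]
final_stack = run(program)
```

The trace follows the walkthrough above: after the first add the stack holds 5 and 13, after the second it holds 18, and the final pop empties it.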
20:21
Just add, huh? And now, we have the C part of the bytecode, executed by the virtual machine when you execute the bytecode. For example, we have LOAD_FAST. Here, in the example. LOAD_FAST will execute this code. Just, okay, I'm going to push something
20:40
on the stack in the memory of Python, in the stack frame, in the frame. For LOAD_CONST, I try to get the value from the constants in the global, so it's just in the locals, and try to use them and just push on the stack. When I try to use BINARY_ADD,
21:02
the opcode, the system will check if it's a string. If not a string, okay, maybe a number. If it's a string, we will create a concatenation, and if it's a number, we just add one and one.
21:20
And after that, just push the result on the stack. For STORE_FAST, we have the source code of Python. Okay? So, for the rest, and just for the fun, if you read the source code of Python,
21:40
no, I think it's sleeping. Yeah, I'm sleeping. In the source code of Python, we have the evaluation, what time? Four minutes, okay. We have this function, PyEval_EvalFrameEx. The system will try to read the source code and eval the source code.
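The real opcode handlers are C code inside that evaluation function, so they cannot be shown in Python directly. As a rough sketch of the behaviour described above for the load, add and store instructions, here is a hypothetical miniature in Python: the `Frame` class and `execute` function are invented for illustration, and Python's `+` stands in for both the string and the number case that the C code distinguishes.

```python
class Frame:
    """Hypothetical miniature frame: constants, local slots, value stack."""

    def __init__(self, consts, local_vars):
        self.consts = consts
        self.locals = local_vars
        self.stack = []


def execute(frame, instructions):
    """Run (opcode, argument) pairs against the frame; return the result."""
    for op, arg in instructions:
        if op == "LOAD_FAST":
            # Push the value of a local variable slot.
            frame.stack.append(frame.locals[arg])
        elif op == "LOAD_CONST":
            # Push a value from the table of constants.
            frame.stack.append(frame.consts[arg])
        elif op == "BINARY_ADD":
            # Works for strings (concatenation) and numbers (addition).
            right = frame.stack.pop()
            left = frame.stack.pop()
            frame.stack.append(left + right)
        elif op == "STORE_FAST":
            # Pop the top of the stack into a local variable slot.
            frame.locals[arg] = frame.stack.pop()
        elif op == "RETURN_VALUE":
            return frame.stack.pop()
        else:
            raise ValueError("unknown opcode: " + op)
    return None


instructions = [
    ("LOAD_FAST", 0),
    ("LOAD_CONST", 0),
    ("BINARY_ADD", None),
    ("STORE_FAST", 1),
    ("LOAD_FAST", 1),
    ("RETURN_VALUE", None),
]

number_result = execute(Frame(consts=[1], local_vars=[41, None]), instructions)
string_result = execute(Frame(consts=["b"], local_vars=["a", None]), instructions)
print(number_result, string_result)
```

The same instruction sequence adds numbers in the first frame and concatenates strings in the second, which is the distinction the talk attributes to the add instruction.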
22:02
The main function is just this function, PyEval_EvalFrameEx, a function with 2,000 lines of code. Just one function, okay? And there is a hack in the function, because sometimes, some compiler, C compiler does not support a feature,
22:21
and we create the default, where in the default, there is another switch for the next bytecode. Okay? So, I have a summary. The summary is just we need to improve the documentation, review some patch, and try to improve the issues
22:40
if you have any problem with Python, and just that. That's really fun. No, so silly, yeah. I like, because when you put, it's on my key, because I'm not a core developer, but I try to add, I am a contributor to CPython.
23:03
Yes, good. Good for you, Ged. No, a small example, I wanted to show you about an issue, not an issue, a small functionality for me. Come on, where are you?
23:22
No, bye-bye. Okay, come on. I just modified CPython, just with a small patch. And sometimes, I want to learn the bytecode, okay? And I would like to create a small debugger where you see on the left, the source code,
23:41
the Python source code, and on the right, the bytecode. Okay, I would like to create that. Here's my example. Bye, yeah, come on. Okay, this is the latest version of Python, the version of today. If you print hello, there is just hello.
24:01
There is a missing, there is a feature in Python, and you don't know it, okay? If you want to see the bytecode of CPython, this feature has been in the source code of Python
24:21
for two or three years. It's not new, okay? If you define LLTRACE, you will get the bytecode and the value of the argument, okay? So, what's the next point? Yeah, I know, two minutes.
24:41
Where's my, sorry, my mouse. Yeah, that's all. Thank you very much, Stéphane. Can I just say, it's absolutely wonderful to know that someone can go from not being a contributor
25:03
to going to a conference and becoming one. So, three years ago, Stéphane could not have given that talk, and now he really is an actual contributor to Python, anyone can do it. Not that I'm saying, Stéphane isn't a wonderful person. But it makes it real, you know. So, is there a quick question before we move on to the next talk? Go on, one question, you can have the honor
25:20
of being the person that asked that question. It will make you special. Let's do it. Yes? Is the documentation available in different languages, and is help needed in those translations? I'm sorry? The documentation, is it available in different languages,
25:42
like in French, in Italian, Spanish, or is it only in English? No, the documentation is just in English, of course, because those are France. But I know that in France, there is a group, the AFPy, they try to translate it into French. No, if you want, you can download the documentation,
26:03
of course, it's just a clone of the repository, and try to translate it. And in the last version of, last, for two or three years, you have a feature in Sphinx where you can translate, you can create a French or Italian part of the documentation.
26:21
Okay? All right, join me in thanking Stéphane once again. Thank you very much. Thank you very much.