The soul of the beast

Formal Metadata

Title
The soul of the beast
Subtitle
Everything about Python's grammar
Title of Series
Number of Parts
118
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The audience will discover one of the core pieces of the language, one that sits at the center of decisions about which new rules can or cannot be implemented in the Python programming language. They will learn how the particularities of the grammar limit what can be achieved but also keep the language consistent, powerful, and straightforward. Attendees will learn how core developers solved some challenging scenarios that arise from these limitations, and why others cannot be resolved unless Python significantly changes the internal mechanism that parses the grammar. Finally, they will learn how a new rule is added to the CPython grammar, a good example of how all the pieces come together.

In summary, the audience will gain a more technical answer to why people perceive Python as an easy but powerful language, along with some insight into how to understand and extend the pieces that form it. This talk will not only help members of the audience better understand the design of the language and how grammars and parsers work, but will also help people who want to contribute to CPython understand the general structure of the compiler pipeline and how to work on it.

Who

This talk is for those who want to understand Python a bit more deeply: not only how everything works under the hood, but also the technical decisions behind it and their consequences. The talk is aimed at all Python programmers, regardless of skill level, as everyone will find something for their particular level of expertise:

Beginner programmers will be introduced to the topic of language grammars and will learn what a grammar is and what its building blocks are. Audience members at this level will also gain insight into how everything is threaded together in CPython.

Medium and advanced programmers will learn some in-depth technical details and how they relate to features they already know and understand. The talk will not only shed light on grammar technicalities, parser features and design, and CPython implementation details, but will also connect many pieces of information to explain how small technical decisions affect the bigger picture.

Outline

Who am I

What is the Python grammar
  What is a grammar? What grammars look like.
  Elements: terminal symbols, nonterminal symbols, productions.

The properties of the Python grammar
  Leftmost derivation.
  1 token lookahead.
  No epsilon productions! (Plus what epsilon productions are.)
  Some immediate consequences of these properties.

How the Python parser generator works
  General structure of the parser generator.
  Nondeterministic finite automata.
  Deterministic finite automata.
  Some examples (with cool graphs!), generated from the Python grammar and the parser generator, of the actual finite automata that Python uses.
  Concrete syntax trees.

Advantages of the grammar (or "why Python is so easy to understand")
  LL(1) grammars are context-free (no state to maintain while parsing).
  LL(1) grammars are simple to implement and very fast to parse.
  LL(1) grammars are very limited, keeping the language simple.

Disadvantages of the grammar
  Grammar ambiguity.
  LL(1) grammars need some hacks for very simple things.
  How keyword arguments were incorporated into the grammar with a hack: the grammar rule is very strange because it is "fixed" in the abstract syntax tree.
  Why parenthesized with statements cannot be implemented (with statements formed of multiple elements surrounded by parentheses and separated by commas).

Implementing a new grammar rule in CPython: the arrow operator
  A complete mini-tutorial on how to introduce a new operator, A -> B, that gets executed as A.__rarrow__(B).
  Altering the grammar and generating the new parser.
  Introducing a new token.
  Changing the tokenizer.
  Changing the abstract syntax tree generator.
  Changing the compiler.
  Implementing the new opcode.
  Implementing the __rarrow__ protocol.

The future and summary of the talk
  We have been discussing on the CPython Discourse changing the parser generator to something more powerful.
  Dangers and advantages of other parser generators.
  What are other implementations using?
  Summary of the talk.
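
The arrow operator in the outline only exists after CPython's grammar, tokenizer, and compiler have been modified, so it cannot be written in stock Python. As a rough sketch of the intended semantics only, the snippet below emulates what the new opcode might do when it evaluates an arrow expression: look up a method named __rarrow__ on the type of the left operand and call it with the right operand. The helper rarrow() and the Pipeline class are hypothetical names invented for this illustration and are not part of CPython or of the talk's implementation.

```python
class Pipeline:
    """Hypothetical class that opts in to the arrow protocol."""

    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def __rarrow__(self, other):
        # In the modified interpreter this would run for `self -> other`.
        return Pipeline(self.steps + (other,))


def rarrow(left, right):
    """Emulate, in plain Python, what the new opcode is assumed to do."""
    # Like the built-in operators, look the method up on the type,
    # not on the instance.
    method = getattr(type(left), "__rarrow__", None)
    if method is None:
        raise TypeError(
            "unsupported operand type(s) for ->: "
            f"{type(left).__name__!r} and {type(right).__name__!r}"
        )
    return method(left, right)


# Stands in for: Pipeline() -> "parse" -> "compile"
result = rarrow(rarrow(Pipeline(), "parse"), "compile")
print(result.steps)  # ('parse', 'compile')
```

Looking the method up on the type mirrors how Python dispatches its existing operator methods. A real implementation would do this in C inside the interpreter loop instead of in a helper function.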
Keywords