A Python for Future Generations
Formal Metadata
Number of Parts | 160
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/33687 (DOI)
EuroPython 2017, 135 / 160
Transcript: English (auto-generated)
00:04
Hello everybody. My name is Armin, and I have been doing lots and lots of Python for many years at this point. Most of you might know me from the Flask framework, which is probably the most popular project I made, and as of some time ago I'm
00:22
working on Sentry on a commercial sort of level, which is also a project that came out of the Python community, and it's an open source project at its core. If you want to find me on the internet, this is where you can do this. Also, the slides of the talk will be at the URL on the bottom
00:42
and hopefully also on the website of the conference. So this talk is sort of about raising a little bit of awareness of how Python actually works for us as a community, and maybe how we can evolve it to some degree. So
01:01
part of what made me do this talk, and two similar ones I gave before, was that as you start programming more and more, you get involved with more than just the language where you started out. Python is sort of my home, but I also use Rust, JavaScript, Ruby and many other things, and when you start using
01:22
something, very often you are amazed by some of the things it does better than your home language. But then as you start using it more and more, you also realize that it has problems too. So I want to bring some of the experiences from other environments a little bit into the
01:41
Python community, maybe so that we can do a better job at evolving the language. So I think the biggest question is: what is Python, actually? Obviously it's the language that everybody uses, but if you look into it a little bit more, it actually turns out that Python is whatever CPython is doing. I think it's a very important concept, because
02:04
CPython is sort of the standard Python interpreter. There's a language reference which tells you how the language is supposed to work, and it is part of the documentation of the Python language, but actually a lot of how we all program Python depends on very specific details of the CPython interpreter.
02:24
And I will give some examples of this and why it is relevant. But I think it's important to know that, unlike JavaScript for instance, we do not have a language standard. So a lot of the code that we use happens to work more or less because it works on CPython. Many of us will have experienced
02:44
that when we take our CPython code and run it on other runtimes like PyPy, we discover that not everything works exactly the same, and the path taken for a long time has been to just make PyPy and the other implementations more like CPython. But we never had a standard. So there are two parts where
03:02
this comes up. One of them is the general language behavior, which is what happens when you add two numbers together, and the second part is the standard library. What exactly the standard library is is also a little bit unclear, but more or less everything that you import that doesn't come from pip is sort of the standard library, and it also comes
03:21
from CPython. That usually also means that the standard library becomes part of, quote unquote, the language specification. So this is my favorite example of CPython code. What does this do? It looks simple: you have
03:41
two values, a and b, and you add them together. But what happens? To give you an example, what happens in JavaScript is that a and b are converted into value representations, which are effectively numbers, and then added together, and if you go to the standard of JavaScript, or ECMAScript, there
04:03
will be an explanation of how this construct works. But this is not JavaScript, this is Python, and what we all learn, I think, is that this is more or less equivalent to calling one of these special dunder methods, a.__add__(b). But is this really correct? This is sort of what you might read in a tutorial, but then
04:23
some tutorials are a little bit more correct and will tell you that it is actually like this: you get the class of a, then you use the dunder method on it and pass self explicitly as the first argument, type(a).__add__(a, b), which is sort of why you can't override the add operator on an instance. So
04:42
that's the explanation often given. But is this really correct? It turns out that there is also the dunder class attribute, __class__: is it equivalent to type() or not? It turns out it's not at all equivalent. And the thing is, none of these explanations is more correct than the others, because they are all wrong in explaining the language: none of this actually happens. If you have a
05:03
plus b, the interpreter will give you bytecode for this, where it loads two values and uses the internal binary add operation, and then if you go all the way down the rabbit hole you will eventually realize that there is
05:20
one interpretation in the interpreter of what binary add means, and it tries one of two things. If the object is a number, it will try to add them. If it's a sequence, it will try to concatenate them. And this is something that nobody really ever looks at, because for the most
05:42
part it is irrelevant. So it turns out that in Python, objects actually have nothing to do with these dunder methods. Internally, for the vast majority of operations that we do, there is a struct in the interpreter for each of the types, and on the types there are methods, and these methods are in slots, and depending on how these slots are set up, these
06:04
operations do different things. And the reason why it matters is that this was a design decision made a long time ago, and everybody else has been forced to copy it, like PyPy for instance. So for instance, the fact that an add operation can do two different things on a type level has very
06:24
profound consequences. In particular, consider what happens if you take something that already has a certain setup in the interpreter and you create a subclass of it in Python. So in particular, if you add a dunder add method, __add__, on an
06:41
object, it will register. If you go back to this, there are two slots internally on the object to do addition. One of them is: if it's a number, add them together. The other one is: if it's a sequence, concatenate. And if you add your own dunder method, it always becomes a number addition.
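The lookup behavior described above can be sketched in a few lines. This is a minimal illustration assuming CPython 3 semantics, not material from the talk's slides; the class `Money` and the variable names are hypothetical. It shows that `+` consults the type rather than the instance, and that the expression compiles to a single binary instruction whose exact name depends on the interpreter version.

```python
import dis


class Money:
    """Hypothetical example class; not taken from the talk."""

    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):
        return Money(self.amount + other.amount)


a, b = Money(1), Money(2)

# Shadow __add__ on the instance only. The `+` operator ignores this,
# because the interpreter looks the method up on the type, not the instance.
a.__add__ = lambda other: Money(-1)

via_operator = (a + b).amount            # uses Money.__add__
via_type = type(a).__add__(a, b).amount  # explicit lookup on the type
via_instance = a.__add__(b).amount       # finds the instance attribute first

print(via_operator, via_type, via_instance)  # 3 3 -1

# `a + b` compiles to one binary instruction. Its name differs between
# CPython versions, so it is only listed here rather than hard-coded.
binary_ops = [
    ins.opname
    for ins in dis.get_instructions("a + b")
    if ins.opname.startswith("BINARY")
]
print(binary_ops)
```

The third value differs from the first two, which is why neither the "a.__add__(b)" nor the "type(a).__add__(a, b)" explanation fully describes what the operator does.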
07:01
That's kind of irrelevant for the most part, because you can obviously still do concatenation in this method. But it's stashed away in the number addition part. But what if you subclass, for instance, a list, where addition is defined as a sequence concatenation? For quite some time there was a bug, I don't know if it was just in PyPy or if it was in
07:21
CPython and also replicated in PyPy. But it happened that if you subclassed a list and added your own __add__ method, then sometimes concatenating lists would call your own method and sometimes it would do whatever was there originally. Nowadays, if the interpreter finds a dunder add, it will also put the proxy into the sequence concatenation slot and
07:41
other things. So there's a lot of complexity in the language as a result of this. But the simple gist is: there is no plus operator. And the reason why there's no plus operator is that when the language was originally created, there was no standardized object model. So there are two internal functions. One is called PyNumber_Add and the other one is
08:00
called PySequence_Concat, and they correspond to those slots: PyNumber_Add will add numbers through the number protocol, and PySequence_Concat will concatenate sequences through the sequence protocol. But if you actually look at the functions themselves, they're different, but each will also do what the other one is doing. PyNumber_Add will attempt to add numbers first and then concatenate sequences, and
08:23
PySequence_Concat will first attempt to concatenate sequences and then fall back to adding numbers. This doesn't make a lot of sense anymore, but it still defines some of the effects that we get in the language. So why does this matter? Or does it even matter? There are different ways in which you can look at
08:42
this, but I think it kind of does matter, because it limits what we can actually do with the language in the future. So there is CPython, but there's also PyPy, there's Jython, and right now I think both of those, I don't know how active
09:00
Jython is at this point, but at least PyPy attempts to replicate every single quirk in the language in an attempt to be as compatible as possible with already existing code. And I think it's cool that the PyPy people are doing this, but at the same time it makes PyPy a lot
09:22
more like CPython, and we're not necessarily gaining anything from this. So why are they replicating all the quirks instead of cleaning things up and making them nicer? It's because everybody wants to have high compatibility. And I think this is the part where we as a community also sort of demand compatibility. Because if our
09:42
code doesn't run on PyPy, we're not willing to give PyPy a chance, for instance. But if you actually look into what this means for the future, it means that it prevents more innovative language changes and features. And if you look very far into the future, what will Python look like in 30 years? I mean, will it just
10:01
be the same, or will computers look so vastly different that we have to change the language? So here is a small proposal: maybe we as a community can make the Python we use more like the Python we actually teach people. Maybe we can eventually achieve a Python where, if
10:20
you add two numbers together, it does nothing else but call a special __add__ method. And this trying to achieve compatibility with stuff we had before is, I think, one of the strongest mantras in the Python community. It's a very common story. So the same
10:44
way as PyPy attempts to be as compatible with CPython as possible, we as a community are building our ecosystem in very similar ways. We very strongly value compatibility, ignoring this Python 3 thing. And this is very well shown
11:05
with packaging. I don't know how many of you ever wrote a setup.py file, but this whole idea of a package being built through a Python script comes from distutils. distutils was eventually added to Python, and it set
11:22
up this idea that you import a function from distutils, it's called the setup function, you call it, and then if you run your setup.py file it will execute this function, do some magic and eventually end up with a tarball. This we still do. It's just that now we use setuptools, and if
11:41
anyone has ever seen setuptools, it's an elaborate monkey patch to distutils. The original goal was to implement something called Python eggs, and most communities in Python, with some exceptions (I think Zope people might still use eggs), have stopped using eggs. But we didn't
12:01
completely stop using eggs. We still use some part of that infrastructure in some of the things we're doing nowadays. But the monkey patching doesn't stop with setuptools. One of the things that setuptools added on top of distutils at the time was the idea that you could run python setup.py develop, in
12:21
which case it would build binary extensions and put them into different paths than where they would normally be, so that you can develop locally with a Python package without having to install it all the time. This idea was later picked up by pip. pip added a thing called pip install --editable, and the way this is implemented, scarily enough, is that at runtime it monkey patches setuptools
12:42
temporarily to get its logic in place. And then wheel also monkey patches setuptools to build wheels instead of eggs, and actually I think even the guy who wrote the bdist_wheel command said it is effectively unmaintained at this point, it's just doing like small
13:01
little things, and I saw at least one fork where someone actually monkey patches wheel to get their own stuff in place. And we're doing it too, because for instance we are distributing binary extension modules with a module called snaek, which lets us build Rust modules for Python, and it's a monkey patch for bdist_wheel. So this
13:24
is like, everybody is doing this. CFFI, a very common module, is also implemented as a monkey patch to setuptools. And in our case snaek is not just a monkey patch to setuptools, it's a monkey patch to CFFI, which monkey patches wheel, and if you ever look at
13:40
it, if you try to do runtime introspection on the classes, for instance if you add two CFFI modules, you run the monkey patch twice, so you have two subclasses internally of the default build extension command from setuptools, which extends the one from distutils, but instead of extending the one from distutils it's actually
14:01
replacing it with its own. It's really quite maddening that it's this way. And we could at one point have realized that this is what we're doing and maybe reconsidered. I think this is also similar, to some degree, to how the community has been attempting to replace or get rid of the GIL for a really long time, but part of the reason we
14:22
can't do it is just backwards compatibility. It's not so much that it's hard. I mean, it's hard, but it's only hard if your constraint is to be compatible with everything you've done so far. Getting rid of the global interpreter lock would probably mean getting rid of reference counting and breaking everything. So the thing is, we're not really good at breaking
14:44
this compatibility. Our only attempt at doing this, I think, was Python 3, and it went... interesting. It was very radical in some ways, but totally not radical enough in others. It changed the language so that everybody had to go through this pain of
15:03
upgrading, but then we ended up with more or less the same as we had before, just with slightly different Unicode. I'll talk about Unicode in particular a little bit later again, but I think it would be interesting to see if we can learn from this a little bit and maybe
15:23
attempt another incompatible Python version, but differently. Because one thing is that if you actually look at what the future of scripting languages is, scripting languages in the sense of what Python, JavaScript and some others are, I think they're not going to go away, but they will definitely look different. You can already see this in that
15:43
async programming has become a first-class citizen in JavaScript, in Python and in many others. But what people really care about is the ecosystem. And in addition to the ecosystem, they actually care about standards more than
16:03
in the past. It was perfectly okay for Python not to be standardized, but it would not have been okay for JavaScript not to be standardized. The fact that there is a very strong JavaScript standard now is what enabled a lot of the stuff in that community. JavaScript before Node and before modern browsers came
16:24
along was very, very different, and coming up with a way to standardize it was sort of a necessary step for that community, which Python was never forced to take. So I think if we still want to be relevant in 30 years, we probably have to evolve a little bit. So here are some of the things we did
16:41
really well, and this is why I don't think Python is going anywhere. The CPython interpreter code is really readable, and I think this is something that gets a lot of people interested in a language, because you can actually figure out what's happening under the hood. I know from a lot of people that I talked with that this is one of the reasons why they got interested: you can very easily go to a lower level and figure out what the hell is
17:03
actually happening. It also means that when you run Python code in production, you're never really left not knowing what's happening. If there's some really bizarre behavior, it's straightforward to figure out what's going on. Especially also because we don't really have JIT compilation, which makes
17:21
it even easier. It's super easy to compile a new Python interpreter, so actually modifying the language itself is, I think, easier than in any other language I've ever used, and this is what makes the community stronger, because a lot of people are actually interested in getting their own stuff into the language. And this is
17:42
vastly different from, for instance, getting a change into Node.js, where there are even internal politics in the language that make it hard, and where the interpreter is actually part of a Google project, and sometimes it's not, sometimes it is. The fact that everything in Python is a compact package that you can modify as you will
18:01
gets a lot of people interested, and also makes them feel like this is a stable platform, because even if commercial support went away they could still take ownership of the whole thing. The fact that we had, or still have, C extension modules meant that we could go into communities that other languages had a hard time going into, especially scientific computing. The Python developers themselves could never figure
18:25
out all the things that the language would be used for, but other communities could come in and adapt it for their use, and that was really strong and powerful. This is also what made the web community very happy with Python, because you could embed it into other environments like web servers and stuff.
18:44
And because we are doing such a terrible job at package management when it comes to multi-version dependencies, it actually means that we have much more stable and flatter dependency hierarchies than a lot of other communities do. I don't know if you're familiar with JavaScript much, but there was one of these
19:03
incidents where someone unpublished a package called left-pad, which added some spaces, I think, on the left side of a string. You would think that if someone deletes such a simple operation from the package index, it could never have an impact, but because
19:20
everybody was depending on left-padding a string by some spaces, somehow deployments failed. People couldn't get new code up because the build service tried to install this very boring package from the internet. As a result, I was looking a little bit into what else the JavaScript community is doing, and it's really absurd in some ways.
19:41
There's a package called isarray, which checks if an object is an array, and it's a one-liner. But because you have one or two dependencies that are using it, it's almost impossible for you to have a large JavaScript project and not also depend on this isarray package. And while the code itself is a one-liner, there is,
20:04
I don't know, like more than half a kilobyte of license file that comes with it, there is a JSON document which describes what it's doing, and there's documentation in it, so you actually download like 10 kilobytes of data. And if you look at what the JavaScript community as a whole is downloading, isarray is downloaded in excess of a terabyte a month.
20:21
And we're not doing that, and I think that's good, because it makes everything a lot more predictable. When we push out a security update in a library, it's very likely that the entire application will see the security update, whereas with JavaScript you might have to push intermediate dependencies as well. A very common problem that we see using JavaScript
20:43
is that there are such hard dependency pins that if you have a dependency which in itself has a dependency, it might be that that dependency doesn't get a security update just because it was pinned too hard. Runtime introspection, I think, is probably Python's best feature: the fact that you can look at what it's
21:03
doing. There are so many nice extensions to Python where you can connect to a process, see what it's doing and look at the threads. Sentry, the company that I work for: the entire origin of that project was that you could crash in Python and look at all the local variables
21:22
that you had in a stack trace. That's really powerful, and I would never want to see this go away. It's very painful to look at JavaScript in comparison, where there's nothing you can do: there's a regular expression with which you can parse a stack trace, and that's the extent of runtime introspection. But here are some of the things we could probably do to make it,
21:43
to make our language more future-proof, and I feel like there's really only one thing that we should care about, which is making the language core easier and simpler instead of just making easier and easier libraries. People in the Python community love simplicity. They love using libraries that
22:02
look simple to use, but a lot of those libraries that look simple to use do really crazy things internally. For instance, the very popular requests library in Python used to vendor packages, and I think it still does to some degree, but the way in which it did it
22:21
involved monkey patching sys.modules and a bunch of other things, and that sort of stuff breaks. The reason it's doing that, and many other libraries are also doing that, is that it looks like one way to tame the beast, but then you don't really see it until it breaks. We could have invested this time in actually figuring out, like, why,
22:41
why is everybody doing this, and can we just make a simpler solution for this problem? So I want to bring some ideas from other communities into Python, and maybe as a community as a whole we can figure out if we can adopt them. There are two languages I want to use as a reference here: JavaScript, which eats the world, and Rust, out
23:01
of personal interest, and also because it's one of the most recent languages to appear, and as such it had the highest chance of learning from everybody else's mistakes. And they did learn from everybody else's mistakes. JavaScript mostly learned from its own mistakes, which is great, but it definitely also picks things up from other languages.
23:20
So this is my favorite topic: packaging and modules. I have actually used a lot of JavaScript packaging by now, and I would never use it as a reference point to learn from, but there are some things that it has done really well. One of them is that the whole description of a package is in a file called package.json, which means that this is a static file. You can
23:43
use a generator to generate it if you want, but you can also load it at runtime and figure out what it's doing. The Rust community has a similar thing with a file called Cargo.toml, which is like an INI file, but not really, and it also defines everything that is relevant in terms of metadata and in terms of installation behavior of the
24:03
library. We don't have that, because we execute code to install a package. We do generate some metadata, but it's generally not available and very slow to load, and the Python community was never really interested in package metadata. But actually package metadata, I think, is the
24:20
most important thing. The fact that you can access your metadata at runtime gives Rust and JavaScript a lot of possibilities to make much nicer decisions than we can. For instance, a package can figure out its own version. That's a simple thing, but it can also figure out its own dependencies, which means that in Rust
24:42
and in JavaScript, if you require a dependency from a package, the import code (or in the case of Rust, the linking code) can figure out what your own dependencies are and give you the appropriate version of a library. This, for instance, makes it possible for one package in JavaScript to have its own
25:03
left-pad function while another package has its own incompatible left-pad function, and they still work because each sees its own little local references. The fact that you can have multiple versions per library has its ups and downs.
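To make the idea of runtime-accessible metadata concrete, here is a minimal Python sketch. It assumes a hypothetical static manifest in the spirit of package.json (the MANIFEST text and the helper names are invented for illustration), and it uses importlib.metadata, which only entered the standard library with Python 3.8, after this talk was given.

```python
import json
from importlib import metadata  # standard library since Python 3.8

# Hypothetical static manifest in the spirit of package.json: plain data
# that can be read without executing any installation code.
MANIFEST = """
{
  "name": "demo-app",
  "version": "1.2.3",
  "dependencies": {"left-pad": "^1.3.0"}
}
"""


def read_manifest(text):
    """Parse a static manifest into (name, version, sorted dependency names)."""
    data = json.loads(text)
    return data["name"], data["version"], sorted(data.get("dependencies", {}))


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


name, version, dependencies = read_manifest(MANIFEST)
```

Calling installed_version with the name of any installed distribution returns its version string; this gives a package a way to discover its own version at runtime, but it does not by itself provide the multi-version imports discussed in the talk.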
25:21
I'm now leaning towards it probably having more downs than ups, given some negative experiences, but those communities are now learning from this. In Rust in particular, and I think in the JavaScript community as well, there is now talk about finding a way to split dependencies in half, where half the
25:44
dependencies are private dependencies, which are only internal to a library, and the others are public. That way, if you have a framework like Flask and an extension to Flask, it's guaranteed that the extension always sees the same Flask as your user code does. So these communities are learning, and we can also start incorporating some of
26:03
what they're doing. Where we are now is that we are actually moving towards that. I think very few people still run setup.py install, and I think we're at the point where we could get rid of setup.py. At least we have the infrastructure in place to build Python wheels without using distutils or setuptools at all.
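As an illustration of building a wheel without distutils or setuptools, the following sketch assembles a tiny pure-Python wheel by hand using only the standard library. A wheel is a zip archive containing the package files plus a .dist-info directory. The package name demo_pkg, the generator string and the function name are made up for this example, and the result is meant to show the layout, not to replace a real build tool.

```python
import base64
import hashlib
import io
import zipfile


def build_minimal_wheel(name="demo_pkg", version="0.1.0"):
    """Return (filename, wheel_bytes) for a tiny hand-built pure-Python wheel."""
    dist_info = f"{name}-{version}.dist-info"
    files = {
        # The actual package content.
        f"{name}/__init__.py": f"__version__ = '{version}'\n".encode(),
        # Static metadata: plain text, no code has to run to read it.
        f"{dist_info}/METADATA": (
            f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n"
        ).encode(),
        f"{dist_info}/WHEEL": (
            "Wheel-Version: 1.0\n"
            "Generator: handwritten-sketch\n"
            "Root-Is-Purelib: true\n"
            "Tag: py3-none-any\n"
        ).encode(),
    }

    # RECORD lists every file with its sha256 digest (urlsafe base64, no
    # padding) and size; the RECORD file itself is listed without a hash.
    record_lines = []
    for path, data in files.items():
        digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest())
        record_lines.append(f"{path},sha256={digest.rstrip(b'=').decode()},{len(data)}")
    record_lines.append(f"{dist_info}/RECORD,,")
    files[f"{dist_info}/RECORD"] = ("\n".join(record_lines) + "\n").encode()

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, data in files.items():
            archive.writestr(path, data)
    return f"{name}-{version}-py3-none-any.whl", buffer.getvalue()


wheel_name, wheel_bytes = build_minimal_wheel()
```

Everything in the archive is static data, which is the property the talk praises in package.json and Cargo.toml.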
26:22
A wheel, once it's been created, is largely just a zip file, and we can use different tools to generate them. Because pip is already a separate tool, pip could be extended to support Python packages that have nothing to do with setuptools or distutils. But we are still far away from
26:42
multi-version dependencies. We would need metadata access, for which there is no good API, and it's not just that we don't have a good way to access metadata: we also have an import system which doesn't support multi-versioning, for various reasons. But I think it would be a very realistic way
27:01
to move towards a completely new packaging ecosystem with less work than we currently collectively spend on trying to make what we have work. You just need to look at the issue trackers of all the packages we have: there is so much pain and suffering hidden there, and that doesn't even show you the
27:22
individual suffering someone has when they try to make setuptools work in new and exciting ways. I think I wasted about a month of my life doing nothing but trying to make Rust work with setuptools, and I'm still unhappy. So maybe we could channel this a little bit and build a different packaging infrastructure, and
27:41
I feel like the packaging community in Python is already going this way. The trick here would be to actually make a language standard, because nobody actually wants to have the current language as a standard. I think everybody who has worked with Python long enough knows that they would try to simplify things, so there's no
28:01
point in standardizing what we have currently. JavaScript just standardized what they had, because the language had a lot less stuff in it than Python, and when they figured out that some of the things they had were impossible to make fast, they actually took away some features of the language. I don't think we as a community
28:22
want to move in that direction, but maybe there is actually something that will get us there. For instance, there's MicroPython, which clearly has had its own experiences with the complexity of the language, and there is a page which lists the differences between MicroPython and CPython. Maybe if we get some more
28:43
slightly CPython-incompatible Python versions, we will actually find a common subset that makes more sense than what we currently assume the subset is, which is the entirety of CPython. So now I feel like maybe PyPy would have been a little more successful by not trying to be CPython but by being bolder, trying to do more
29:03
exciting things that give people an actual good reason for using it. I think the biggest problem that other Python versions still have at this point is an idea we have as a Python community: if you go to the documentation and ask how to build an extension module, it says to use distutils and setuptools to
29:20
build a Python extension module, and nobody ever told people that this is wrong. The documentation should really say: unless you really, really know what you're doing, please don't do it. It makes no sense, because there are so many downsides to building a Python extension module with the Python C API. You will suffer for this for a long
29:42
time, and there are so many better ways to do it, like cffi, where you build an independent library, consume it from Python, and get away from the idea of sending Python objects between your new world and the other one. But nobody in the community ever said that it's a stupid idea to build Python extension modules, and everybody still tries to do it. So
30:03
maybe we should just put it into the documentation that there are alternatives to building CPython extension modules, because once we get away from this we can actually liberate ourselves and use more interesting Python interpreters. This is my favorite topic: Unicode. I think we did it completely wrong, and the more I use other languages, the
30:21
more I'm convinced that we got Unicode completely wrong. This is what Rust is doing: they use UTF-8 everywhere and everything gets easier, and where they can't use UTF-8 because it's not possible, they use a thing called WTF-8, which is the wobbly transfer encoding format, I guess.
30:41
This allows them to be compatible with UTF-16, or UCS-2 I guess, in places where they have to interface with a world that is not completely Unicode aware, on Windows in particular. WTF-8 also came up with JavaScript, which for similar reasons as
31:04
Windows decided that two bytes per Unicode character is everything they would ever need. So they found new and innovative ways to deal with this problem, and they just embraced UTF-8 everywhere. We should too, but it's very hard.
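The core trick of WTF-8 (Wobbly Transformation Format, 8-bit) is that lone surrogates, which strict UTF-8 rejects, still get a byte representation. Python's surrogatepass error handler behaves similarly for a lone surrogate, so it can serve as a rough approximation in a sketch; it is not a full WTF-8 implementation, because it would also encode a surrogate pair as two three-byte sequences, which WTF-8 does not allow. The helper names below are invented for this example.

```python
# A lone (unpaired) surrogate: it can occur in UTF-16 based systems such as
# Windows file names or JavaScript strings, but it is not valid Unicode text.
LONE_SURROGATE = "\ud800"


def encode_strict(text):
    """Strict UTF-8: raises UnicodeEncodeError on any surrogate."""
    return text.encode("utf-8")


def encode_wobbly(text):
    """Approximation of WTF-8: surrogates are encoded like ordinary code points."""
    return text.encode("utf-8", "surrogatepass")


def decode_wobbly(data):
    """Inverse of encode_wobbly."""
    return data.decode("utf-8", "surrogatepass")


try:
    encode_strict("a" + LONE_SURROGATE)
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

wobbly_bytes = encode_wobbly("a" + LONE_SURROGATE)  # b"a\xed\xa0\x80"
round_tripped = decode_wobbly(wobbly_bytes)
```

The point of the design is that ill-formed data from the operating system can be carried through losslessly while everything well-formed stays plain UTF-8.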
31:21
The reason it's hard for us is that we use strings differently than other communities do. But the benefit of using UTF-8 everywhere is that there's very little guessing about encodings. Did you know that if you open a file on Python 3 in Unicode read or write mode, it's not UTF-8 by default? It guesses the encoding. It's UTF-8 on most of the computers you have ever
31:45
used, but if I SSH into my server it's ASCII, because it guesses the encoding from the file system and falls back to ASCII. Unless that was changed recently, it used to fall back to ASCII if it couldn't figure it out. Rust, for instance, decided that instead of trying to shoehorn
32:05
more stuff into the Unicode type, they would build a separate string type to interface with the operating system. If you, for instance, use the Unicode APIs on Python 3 to interface with the file system, you get Unicode strings back. If it can't decode a file name because it's invalid, you will also
32:23
get a Unicode string back, but it contains characters which are invalid Unicode. So if you pass the string on long enough, it will eventually break in the same way as it broke in Python 2, just with a much more confusing error message saying that it contains surrogates. I remember that I had this conversation at one point,
32:42
like five or six years ago, that the reason we don't want to use UTF-8 is that lots of people, though not everybody, actually benefit from living only in the basic plane, which means only two bytes per character, because for instance in Japanese that might be a more efficient
33:04
representation than UTF-8. This also sparked the idea that in Python 3 a string will attempt to stay at one byte per character for as long as it can, then upgrade to two bytes until it no longer can, and only when you have characters outside the basic plane will it go to four
33:22
bytes per character. It turns out that as of, I think, at least two or three years ago, this optimization no longer makes sense for a lot of applications, because people use emojis, and emojis are way past the basic plane. So you're now in this really absurd situation where if you render a template
33:40
in Jinja2, it starts out with HTML code that fits into ASCII, so it's one byte per character. You stream a little further, you hit your first Unicode character, and it re-encodes everything into two bytes. Then you hit the first emoji, because someone left a funny comment, and it does it all over again with four bytes per character. So the world has evolved to a point where Unicode is now used for more than two bytes per
34:03
character. So could we move to this idea of having UTF-8 everywhere? We could, very easily. We just have to give up the idea that we can access a character in constant time and that we can slice strings. But we love slicing strings in Python, so I think that's a little bit in the way of doing
34:20
this. But I'm now fully convinced that, fundamentally, the idea of being able to access a character in constant time doesn't make any sense and isn't useful, and also that we don't need string slicing. We would have to start moving away from doing this so that we could then start embracing UTF-8 as an
34:41
internal encoding, and we are very far from that. I already talked about extension modules. I would love to get rid of them as much as possible, use more cffi, and as a result use less libpython. If you've tried to build a C extension for Python and you want to distribute it to other
35:02
people using Linux, there's a thing called manylinux1. It's a Docker image that contains a very, very old version of CentOS. I think it's CentOS 5; it's probably eight or nine years old. The reason you build on a very old Linux is that the result is then upwards compatible with more modern Linuxes. It's very painful
35:21
because you can't do modern SSL in this Docker container, but we can build a C extension for Python on a very old Linux and it runs on new Linuxes. So in theory all we would have to do is make one extension for OS X, one for Windows, and two for Linux, one for 32-bit and one for 64-bit, and we would be done. But because
35:42
everybody links against libpython, you actually have to build one for Python 2.7 with 2-byte Unicode characters and one for Python 2.7 with 4-byte Unicode characters, multiplied by Linux 32-bit, Linux 64-bit, OS X and Windows. Then you have to do the same thing for Python 3.3, 3.4, 3.5 and 3.6. If you're lucky and you can use the stable ABI,
36:05
eventually you don't have to do the two-Unicode-widths thing anymore and can just assume one. But to release one binary extension module that people don't have to compile themselves, you probably have to build something like 21 to 24 different tarballs or wheels. This is excessive, and the only reason is libpython. If you build a cffi module you can get
36:24
away with four, and this is a benefit that was never really understood by the community. So now you know: use only cffi if you can. But is it a realistic change to move towards cffi instead of extension modules? It's impossible for a lot of libraries. NumPy, for instance,
36:43
would not be able to move to cffi as far as I understand, nor would anything that sends Python objects around. But if you want to make your JSON parser fast, or if you have a utility library written in some language and you want to use its
37:01
functions in Python, where you don't have to pass entire Python objects around, it's very possible, and I think it's much easier to use as well. But because everybody was pushed towards regular extension modules, I think people don't even consider that cffi might be the better solution. The last part is something that we should steal from somewhere else:
37:23
linters and type annotations. I also put Babel on there. Babel is a library for JavaScript where you can take JavaScript code, do stuff with it, and generate other JavaScript code. This actually turned out to have a really profound impact on the JavaScript community, because you can use more modern language
37:42
features on an older version of JavaScript. And because it was accepted as a possible path of software development, there's a concept called source maps, with which you can still figure out where an error was in the original untranspiled code. This actually made it possible to target newer versions of JavaScript on very old runtimes.
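A rough Python analogue of transpiling while keeping source locations can be sketched with the standard ast module. The example rewrites the @ operator into a call to a function named matmul, which is a hypothetical helper assumed to exist on the target runtime, and copies the original line numbers onto the generated nodes, a crude stand-in for what source maps provide. It relies on ast.unparse, which needs Python 3.9 or newer, and it is only an illustration, not how Babel works.

```python
import ast


class MatMulToCall(ast.NodeTransformer):
    """Rewrite `a @ b` into `matmul(a, b)`; matmul is a hypothetical helper."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if isinstance(node.op, ast.MatMult):
            call = ast.Call(
                func=ast.Name(id="matmul", ctx=ast.Load()),
                args=[node.left, node.right],
                keywords=[],
            )
            # Keep the original position so errors still point at the
            # line the programmer actually wrote.
            return ast.copy_location(call, node)
        return node


def transpile(source):
    """Parse source, apply the rewrite, and return the transformed tree."""
    tree = MatMulToCall().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return tree


SOURCE = "x = 1\nresult = a @ b\n"
tree = transpile(SOURCE)
generated = ast.unparse(tree)  # requires Python 3.9+
```

The transformed tree can also be passed to compile(), in which case tracebacks report the line numbers of the original source.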
38:03
Maybe something like this will also be an option in the Python community, to use things like async functions more prominently in older versions of Python. Maybe there could be a thing where you can transpile Python 2 code to run on Python 3, who knows. And then obviously TypeScript and Flow are
38:23
very popular extensions for JavaScript to get static typing in. I think we're moving this way with typing on Python 3, but we never really embraced it as much as the JavaScript community did. Also, it's very common now in other communities to just run a program that reformats your code to the one true style,
38:41
and there are no arguments about it. Go doesn't even let you compile code unless it follows the naming conventions. We would never be able to go there, because the standard library already has something like 20 different naming conventions, but maybe for our own code we could start to embrace the idea that there is this one thing you run. Maybe there would be something like a flake8 "fix my code style".
39:02
There are some attempts in Python to do this, but one thing we learned, for instance, is that if you use one tool to format your source code according to some standard, your linter will complain about the output of that tool, because the linter was written by different people than the formatter, and stuff like this. Not great, but I think this is probably
39:21
one of the more realistic trends we have for moving the language into a new area where we can agree on standards and stuff. So what can you personally do? Abuse the language less. Don't do stuff like this: there is so much code that gets a random frame and assigns a
39:42
local variable and calls that a DSL. zope.interface, for a very long time, modified the class scope through sys._getframe. Try not to subclass built-ins anymore; there's only suffering. Stop writing non-cffi extensions if you can, and just stop being clever with sys.modules, because if
40:03
you stop being clever with sys.modules, maybe we can make a really cool import system which lets us do multi-version dependencies. One of the biggest mistakes ever made was that pickle addresses its types by their internal dotted name. And, for instance, if you want to import a module at runtime,
40:23
it used to be that the __import__ function was so impossible to use that everybody called __import__ on whatever they wanted to import, ignored the return value, and then assumed that what they imported was actually in sys.modules. So they would do __import__("foo.bar"), ignore the return value, and then return
40:41
sys.modules["foo.bar"]. This obviously won't work with multi-version dependencies, because sys.modules would have to have different keys. That was just an API design mistake that was copy-pasted all over the world, and now everybody is still doing this. But awareness is the first step: if we know not to do these stupid things anymore, maybe we can evolve the language. And
41:01
with that, if there's still some time left, I will take questions.
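The __import__ and sys.modules pattern described near the end of the talk can be reproduced with a standard-library package. In this sketch, xml.dom stands in for the foo.bar of the talk, and the two helper function names are invented for illustration.

```python
import importlib
import sys


def import_old_style(dotted_name):
    """The copy-pasted pattern: call __import__, ignore what it returns,
    then look the module up in sys.modules by its dotted name."""
    __import__(dotted_name)
    return sys.modules[dotted_name]


def import_new_style(dotted_name):
    """importlib.import_module hands back the requested submodule directly."""
    return importlib.import_module(dotted_name)


# __import__ returns the top-level package, not the submodule that was asked
# for, which is why so much code fell back to a sys.modules lookup.
top_level = __import__("xml.dom")
submodule = import_new_style("xml.dom")
```

Code written in the new style never assumes that a module lives under one global dotted key, which is the assumption the talk identifies as an obstacle to multi-version dependencies.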
41:22
Actually, if someone has questions, is there a microphone, or how does it work? I will just repeat the question, so just tell me. So the question is: if you cut away the hacks,
42:04
would Python become less nice to use, because a lot of the ecosystem actually depends on these hacks, like gevent and other libraries? Probably the answer is yes, if you take them away completely. You should never take away people's ability to experiment with this, but it doesn't mean that everything has to stay a hack forever.
42:23
I think it's nice to hack around temporarily to make setuptools nice, but instead of continuing this hack forever we could just say that there is a legitimate need for this and maybe we can do it differently. All right, thank you.