Python Standard Library, The Hidden Gems
Formal Metadata
Number of Parts | 118
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/44776 (DOI)
EuroPython 2019, 95 / 118
Transcript: English (auto-generated)
00:08
He already said everything, but let me have a quick introduction on what I'm doing in my life right now. I'm the maintainer of the TurboGears2 web framework, which is one of the top 10 web frameworks
00:22
in Python currently, and maintainer of the Beaker project for caching and sessions, which is widely used by many web frameworks. Contributor to Ming, ToscaWidgets, WebOb, Kajiki; as you can see, all web-related projects. So most of my time is spent in the web world with Python.
00:40
And from the work point of view, I'm one of the founders of a company in Italy, technical advisor of three startups in Italy, and one of the engineers behind the crunch.io project, which is a great project; if you don't know it, have a look. And the reason why I decided to speak about this topic is that I recently published a book, which is named Modern Python Standard Library Cookbook;
01:03
as you can guess, it talks about the Python Standard Library. Why did I decide to write such a book when Python is full of great documentation, great books, and things like that? The reason is that most of the cookbooks that are available for Python are actually about the language,
01:21
the core features of the language. Some of them showcase some features that are available through the standard library, but most of the focus is on the language itself. And we have some great sources of documentation about the standard library, like the documentation itself, the Python Module of the Week project,
01:40
and things like that. But they try to focus on how you can use those modules, how they work; they act more like a reference when you already know that they exist. They are not a very convenient way to discover new modules or corner cases or ways you can use those modules.
02:01
And the Python Standard Library is really huge. I still discover new features and new modules and functions every day, and I've been using it for like 15 years or something like that. And this is actually a very interesting topic because recently there was a proposal to reduce the number of modules included
02:21
in the standard library because it's so big that the burden on the maintainers is not manageable anymore. Some modules have been unmaintained for years because no one has knowledge about them. There are not enough people willing to maintain the standard library to keep it at its current size. So sadly in the next years it's going to shrink,
02:42
but still it has some very cool hidden gems that I want to share with you. The first one that I want to share with you is how to report crashes from your production applications or like any kind of application actually. This is something that I consider very important for any kind of application.
03:01
You want to know that there is an issue because you get notified by your monitoring system, not because your users come and complain to you. So you want to be aware of any kind of issue before it can impact any user or before it impacts them enough that they start complaining.
03:21
And being notified of crashes is one of the best ways to know there is a bug in your code, because as much as you could test your application, as much as you could have a quality assurance process, the users will always discover bugs, because the complexity of the world out there is much bigger than anything we can replicate
03:42
in a constrained environment like our testing systems and things like that. So this is an example of what I want to achieve which is every time the application crashes, it sends me an email with the traceback and the error. And this can be achieved actually in a very simple manner
04:02
through the logging module in Python. I guess that most of you have used the logging module at least once in your life but I think that not many people know that the logging module is actually very flexible. It can be configured in many ways.
04:20
The logging module is structured in a way that you have what are called handlers, which are in charge of doing something when you submit a message to the logging system. And one of the things that they can do is actually forward that message to you through an SMTP server.
04:40
So you can get your log by email whenever the logging system outputs a new log entry. That's of course not very convenient for logging itself. You don't want to receive thousands of emails every time you write something to the log. But if used properly, it can be a very good way to provide hints about some specific messages.
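A minimal sketch of this kind of setup, together with the crash-reporting decorator the talk goes on to describe. This is not the code from the slides: the names `CrashMailHandler`, `make_crash_logger` and `report_crashes`, the logger name and the fixed fallback subject are assumptions made for illustration. Only `logging.handlers.SMTPHandler`, its `getSubject` hook and `Logger.exception` come from the standard library.

```python
import functools
import logging
import logging.handlers


class CrashMailHandler(logging.handlers.SMTPHandler):
    """SMTPHandler variant that uses the log message as the mail subject."""

    def getSubject(self, record):
        # SMTPHandler normally sends a fixed subject; overriding this hook
        # is one way to get the message into the subject line.
        return record.getMessage()


def make_crash_logger(mailhost, fromaddr, toaddrs):
    """Build a logger whose ERROR entries are forwarded by email.

    No connection is opened here; the SMTP server is contacted only
    when a record is actually emitted.
    """
    logger = logging.getLogger("crashlogger")  # hypothetical logger name
    handler = CrashMailHandler(
        mailhost=mailhost,
        fromaddr=fromaddr,
        toaddrs=toaddrs,
        subject="Application crashed",
    )
    handler.setLevel(logging.ERROR)
    logger.addHandler(handler)
    return logger


def report_crashes(logger):
    """Decorator: log any exception (with traceback), then re-raise it."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                # Logger.exception records the message plus the traceback.
                logger.exception("%s crashed", func.__name__)
                raise
        return wrapper
    return decorator
```

Decorating the entry point with `@report_crashes(make_crash_logger(...))` would then produce one email per unhandled exception, assuming the mail host, sender and recipients you pass are valid for your environment.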
05:04
And in this case, what I'm going to do is decorate a function, usually the main of your software or the main of your WSGI application in the case of web applications. It's decorated so that every time there is an exception,
05:21
it calls the exception logging helper on the logger that we just configured. If you notice in the previous page, the crash logger is configured to send every entry by email. And here we tell the logger to record a new exception. The convenience is that the exception helper
05:43
actually does not only record the message, which is going to be used as the subject of the email, but it also records the traceback and the line where it crashed and things like that. So if this is used properly, you are going to receive by mail every single error that your application faces
06:03
so that you know before your users start to complain. And this requires like two lines of code. You just decorate the entry points of your application or the most important functions and you get notified. It works especially well in the context of web development because you have a single entry point
06:21
in the whole application that is called by the WSGI server every time it has to serve a request. So every time the request crashes, you get notified for that specific request. Another cool feature that's available in the Python standard library, which very few people know about, is that you have a way to implement literal objects.
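As a small illustration of what the talk explains next: the transcript only says "namespaces", so the use of `types.SimpleNamespace` here is an assumption, and the `greet` function and its attributes are hypothetical.

```python
from types import SimpleNamespace

# Create an ad-hoc object without declaring a class.
user = SimpleNamespace(name="Alice", age=30)

# Attributes can be added or changed at any time, like dictionary keys.
user.email = "alice@example.com"


def greet(person):
    # Works with any object exposing a .name attribute (duck typing).
    return f"Hello {person.name}"


greeting = greet(user)
```

Unlike a plain dictionary, the object is read with dot notation (`user.name`), so it can stand in wherever code expects attribute access.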
06:42
If you ever worked with other dynamic languages, like probably JavaScript, because it's the most widespread one, you probably got used to having literal objects and being able to just throw around objects allocated at any point in your code with any kind of properties, without having to declare classes
07:02
or proper structures and things like that. That's not the best way to organize your code, but it can be very convenient when experimenting or when writing tests that have to simulate the behavior of other objects and you don't want to end up using real mocks or things like that. Whenever hacking around in general,
07:22
it can be convenient to be able to declare objects that don't require a proper class hierarchy to be created. And usually in Python, you frequently end up using dictionaries for this purpose because you can put anything you want in the dictionary. But if you have to call third-party code,
07:41
it's probably going to expect some kind of object and not a plain dictionary as the argument. So the problem with dictionaries is that they do not support the dot notation for access to their content. And that's the reason why in the standard library,
08:00
we have something like namespaces, where you can actually set any property you like; they act like a dictionary, nothing different. But the major difference from dictionaries is that you can access all the properties that are inside the namespace with the dot notation. So in practice, you can create namespaces anywhere
08:20
and pass them around, and as we are talking about a fully dynamic language, they will be accepted by all the functions that you need to call as long as they have the properties that those functions expect. So you reach nearly the same level of literal objects
08:41
that you can achieve in other languages like JavaScript and so on. And another very cool feature I want to talk to you about is the fact that Python provides spooled temporary files. This is a kind of temporary file that I don't frequently see used in code,
09:02
but it can be very convenient for some reasons. I think that all of you, at least once in your life, needed to store some data in memory in the form of a file, or read it, or something like that. And frequently we end up using BytesIO as the way to store this data, to keep this data around.
09:24
The problem with BytesIO is that it grows as far as your memory allows it. So in some contexts where you have constrained resources or where you have multiple processes competing for the same resources, this can be a problem, because let's say that you are storing in a BytesIO
09:43
a file that was uploaded by a user to your web application. If the user uploads a 20-gigabyte file and is able to actually send it to you, you are going to consume 20 gigabytes of memory. And that's going to be a problem if your system doesn't have 20 gigabytes of memory
10:00
or if your system has other processes that compete for the same memory. And the solution to this problem is usually to rely on disk files, like temporary files. Most web frameworks, when you submit a file, actually store it
10:20
as a temporary file on disk somewhere and they give you access to that temporary file. And this is cool because it solves the memory consumption problem. The disk is usually far bigger than the memory you have available, so space shouldn't be a major problem, but it has a major performance impact:
10:40
the disk is far slower than your memory. And in some cases, you can also be throttled by the maximum number of I/O operations per second you can do and things like that, especially if you are on a shared file system, like Elastic File System or things like that. They have a maximum number of I/O operations
11:01
that you can do across the whole cluster. So if you have many nodes that are writing and reading on the same file system, you're probably going to hit the I/O limit before you hit the bandwidth limit. The solution to this problem is actually available in the standard library and prevents you
11:21
from facing this kind of problem. These are actually real logs from a web application that was being continuously killed; you see Apache is usually the process that got picked by the OOM killer, just because a user was uploading very big files with the intent to bring down the system on purpose.
11:43
And they succeeded in doing that. But if you use a spooled temporary file, you are going to get the best of both worlds. What's going to happen is that as long as the file is smaller than the specified max size, Python is going to work on it in memory.
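A minimal sketch of the idea using `tempfile.SpooledTemporaryFile`. The helper name `store_upload`, the chunked input and the 1 MiB default threshold are assumptions chosen for illustration, not values from the talk.

```python
import tempfile


def store_upload(chunks, max_size=1024 * 1024):
    """Collect uploaded chunks into a file-like object.

    Data stays in memory while it is smaller than max_size; once it
    grows past that threshold it is transparently moved to a real
    temporary file on disk.
    """
    spooled = tempfile.SpooledTemporaryFile(max_size=max_size)
    for chunk in chunks:
        spooled.write(chunk)
    spooled.seek(0)  # rewind so the caller can read from the start
    return spooled
```

The caller reads from the returned object exactly as it would from a `BytesIO` or a regular file, and closes it when done; whether the data ended up in memory or on disk is an implementation detail.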
12:02
So it's going in practice to use something like BytesIO. Everything happens in memory, it's very fast, you can work on it in nearly real time and things like that. But if your file grows beyond a certain size, it gets swapped to disk. So you don't have to care about your users
12:22
exhausting all your memory or things like that. And this is probably the best solution to that kind of problem, because for realistic sizes, for those that will make up 90% of your requests, you are going to process them quickly in memory. And for big requests that might be malicious
12:42
or might be an exception, you are just going to be a bit slower because you offload them to the disk, but you're not going to crash anymore. Another interesting problem that is frequent in some kinds of applications is proper output alignment and display.
13:01
If you ever wrote a terminal application, you probably faced the case where it takes you a few hours to provide the output you want to provide, a proof of concept of your application, and you spend the rest of the days trying to format it in a way that works reasonably on all sizes and kinds of terminals.
13:24
However, your users might have decided to resize their window and things like that. That's a very hard problem that frequently even software that has been around for years is not very good at solving. And to show you this case, I picked a very common example
13:44
which is a progress bar. Progress bars are very widespread. They are a very recognized and known way to show your users the progress of some kind of operation. But they are also a very good example for this case because if you don't properly format a progress bar,
14:02
your user is not going to understand anything. It would be just a huge mess of things on their screen. And you lose any ability to provide the information that you want. For example, this is the progress bar from a real piece of software existing out there. And you can see that at a certain point,
14:23
it just becomes a huge mess of truncated data and at one point it goes onto a new line. It's really hard to understand what's going on. Is it progressing at all? How far along am I right now, and things like that. So this is why you want to make sure
14:42
that your software is properly able to adapt to the screen size of your users. And the standard library actually has a way to do that. And I use that utility to implement the progress bar that adapts properly to the size of the screen.
15:00
And here you can see the example. It's very simple to use. You just decorate the function that has to report the progress and yield any progress advancement from that function. So that function can compute whatever it has to compute, write the result whenever it has to write it and meanwhile it can report progress by yielding values back to the decorator.
15:25
And the output, the result is, as you can see, the nicely formatted progress bar in theory. The implementation is pretty straightforward. It's just a function that loops and continuously consumes data from your decorated function.
15:40
Your decorated function is called to retrieve the generator; you see gen = func(). And then it just goes on calling next on your generator forever. So the implementation is not the interesting part here. The interesting part here is that in the shutil method,
16:01
in the shutil module, sorry, you have a way to grab the, where is it? I can't find it anymore. Here, sorry, at the top. You have a way to grab the current terminal size. So you are able to adapt the output of your function
16:21
to the current size of the terminal window. And this will take care for you of most of the complexity because it's not something very simple. It depends on the terminal configuration and the kind of system you are on. On Windows it's different than on Unix systems and things like that. So it's not a very straightforward piece of information to obtain
16:43
but the standard library takes care of making sure you get back something meaningful for you. And even better, if I move this call inside the while loop, it will also adapt if the user resizes the window in real time. So I don't have to care about it.
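The decorator described above could be sketched roughly as follows. This is a reconstruction under assumptions, not the code from the slides: the names `render_bar` and `with_progress`, the bar characters, and the convention that the decorated generator yields fractions between 0 and 1 are all made up for illustration. The only standard-library piece the talk points at is `shutil.get_terminal_size`.

```python
import functools
import shutil
import sys


def render_bar(fraction, columns):
    """Return a progress bar string that is `columns` characters wide."""
    fraction = min(max(fraction, 0.0), 1.0)
    label = f" {fraction:4.0%}"                 # e.g. "  50%"
    width = max(columns - len(label) - 2, 1)    # 2 chars for the brackets
    filled = int(width * fraction)
    return "[" + "#" * filled + " " * (width - filled) + "]" + label


def with_progress(func):
    """Decorator: draw a bar for every fraction yielded by `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for fraction in func(*args, **kwargs):
            # Queried on every step so the bar follows window resizes.
            columns = shutil.get_terminal_size().columns
            # One column is left free to avoid wrapping on some terminals.
            sys.stdout.write("\r" + render_bar(fraction, columns - 1))
            sys.stdout.flush()
        sys.stdout.write("\n")
    return wrapper


@with_progress
def work(steps=5):
    for step in range(1, steps + 1):
        # ... the real computation would happen here ...
        yield step / steps
```

Calling `work()` redraws the bar in place on each yielded value. When the output is not attached to a terminal, `shutil.get_terminal_size` falls back to a default size of 80 columns.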
17:04
Let's see the next feature available in the standard library. How to survive an atomic fallout. Not really. Actually it's how to survive atomic writes. So in many web applications there is a need to
17:22
replace the content of a file, of some data that you are storing on disk, suppose an uploaded image or something like that, only if you successfully wrote it for real, if you completed writing it. If your process gets killed in the middle of the write, if your process crashes in the middle of the write
17:42
or things like that, you should not have corrupted the file you were replacing. This is actually a very common problem for nearly all applications that have to store data. Think of your database system. Would you be happy if you were inserting data and, if it crashes in the middle of the insertion,
18:02
you corrupted the whole content of the database? That's probably not something that's going to make you very happy. But most of the applications I saw around suffer from this bug, from this issue, that if you interrupt the write you will have corrupted the data, because the way the system writes the data is by writing in chunks of bytes.
18:23
So if you interrupt between two chunks you only wrote the first part of the data, and probably you end up with a file that has half of the previous content and half of the new content and makes no sense at all. Or it might have only half of the new content, so it's still truncated, invalid data.
18:42
And we can implement, using the standard library, a utility that keeps you safe from this kind of issue. And it will also allow us to retain the same exact behavior we are used to when working with the plain open function. We are just going to use safe open instead of open
19:03
and then we can write to and read from that file without worrying, because it will guarantee for us the atomic write to the file. So if we use safe open, in case our code crashes in the middle of writing
19:21
it will roll back to the previous state of the file, and the implementation is very straightforward. We are going again to rely on a very convenient module of the Python standard library, which is the tempfile module. It provides a kind of temporary file which is the named temporary file. It means that you don't only get back the handle
19:42
that gives you access to that temporary file, but you also get back an actual file name on disk, so something on which we can call more functions based on the file name itself. And why is that important? Because by virtue of the fact that we are within the context manager
20:02
when you open a file we actually don't open that file but we open a temporary file. So we are going to write to a new file and not to the one you were targeting. So in this example you see that I opened /tmp/myfile and what I'm going to do is that instead of opening
20:22
/tmp/myfile I'm actually going to open a temporary file somewhere on the disk. Then my code works on that temporary file and only if it succeeds the temporary file is closed
20:41
and then the file is renamed, so it's moved to the place that was previously used by the file we wanted to open. And this works as long as you are on the same file system, so you are not moving across different disks or to a remote file system or things like that. As long as you are on the same file system
21:01
the rename operation will be atomic. There is no way to interrupt a rename halfway. Either I renamed it or I didn't. So in the end either I wrote the whole content to my file, because I was able to finish the whole function and rename my temporary file to the target file
21:21
and so I replaced the target file with the data that I wrote on the temporary file. Or I failed doing that and so I didn't rename anything. My temporary file is left there and my target file is still there as it was before. Doesn't even know that something was written somewhere.
21:44
Instead if I get an exception I just throw away the temporary file and nothing was ever written. My target file is still in the same state it was before. And we can see how this works with this simple example.
22:04
You see that in the first case I obviously just open the file and write the content and if I read it back I will see hello world because I wrote everything and closed my context manager. While in the second example I open my file
22:21
I write some content and I throw an exception which will cause my code to exit from the context manager. Now if I read back the content of my file, what do you expect to see?
22:50
It's important to notice that in this example I used the standard open so I didn't use my safe open. And for this reason what I'm going to see
23:01
is only the first line of the content. My previously written hello world is thrown away and it gets replaced with my "replace the hello world", but I never had the second part of the text written to disk, which is what you expect. It's very straightforward to understand its content. The problem is that I threw away
23:21
the previous content of the file. It's lost forever, but I didn't finish writing the new content that I wanted to write. While instead if we use the safe open for the same exact example, if I read back the content of the file after the crash of my code, I won't read "replace the hello world"
23:42
but I will actually read hello world. Nothing was ever written to my file by virtue of how the safe open context manager works. So I didn't corrupt my previous data in any way. I didn't write anything and everything is as it was before. It's like having a transaction system for files.
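The behavior described above can be sketched with a small context manager. This is a reconstruction under assumptions rather than the code shown in the talk: the talk speaks of renaming the temporary file, while this sketch uses `os.replace`, which also overwrites an existing target on Windows, and it creates the temporary file in the target's directory so both stay on the same file system. The `demo` function and the text it writes are illustrative.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def safe_open(path, mode="w+b"):
    """Write to a temporary file and move it over `path` only on success."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the final move stays on one file system.
    tmp = tempfile.NamedTemporaryFile(mode, dir=directory, delete=False)
    try:
        with tmp:          # closes the temporary file on exit
            yield tmp
    except BaseException:
        os.unlink(tmp.name)   # failure: discard the partial data
        raise
    else:
        os.replace(tmp.name, path)   # success: atomically take the target's place


def demo(path):
    """Write a file, then crash during a rewrite, and return what is on disk."""
    with safe_open(path) as f:
        f.write(b"Hello World")
    try:
        with safe_open(path) as f:
            f.write(b"Replace the hello world, ")
            raise RuntimeError("crash before the second write")
    except RuntimeError:
        pass
    with open(path, "rb") as f:
        return f.read()
```

With this sketch, `demo` returns the original content because the interrupted rewrite never touched the target file. A production version would probably also flush and `fsync` the temporary file before the move; that is left out here to keep the example short.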
24:01
If you fail you can roll back the content of the file to the previous state and be happy. One more feature that's available in the standard library and very few people know about is the multiprocessing manager.
24:21
Some people probably used it in the past too and ended up relying on it to share data across multiple processes. It's very common. It's the reason why it was born. It's a way to share data across processes. You can store any value inside the manager
24:43
and that value will be available to all the processes that share the same manager. So you can write multiprocessing software without having to manually share the data across the processes. Any data that you store in the manager will be available, readable from all the processes that are children of the process that forked the manager,
25:03
and they can replace the data and things like that. It's like having a database where you can store the data you want to share across all your systems. But the interesting thing is that very few people know how the manager actually works. And the way it works is that it actually forks
25:22
a sub-process of your process, and this sub-process is a fully functional in-memory database system that listens for requests through TCP. So the point is that from any process on any machine, you would be able to connect to that process, to the manager, and store data there.
25:43
You don't even need to be a child of the process that forked the manager. As long as you know which port and address it is listening on, you can write and read data to that manager. So what we actually have is a fully functional database system without having to start
26:02
or manage any database system. For very simple projects, this can become a convenient way to store the data and permit access across different processes from different machines. And it's easy to see that if you pair the manager with the shelve module, you have actually implemented
26:21
a fully functional database system, because you can shelve on disk all the data that is in the manager and you can restore it from disk the next time the manager starts. So let's see how it works. In this case, I'm going for convenience to just rely on the fact that I forked the manager
26:41
so that I don't have to show you the machinery involved in connecting to a remote manager, because the example is already pretty long like this and didn't fit in a single slide. So let's suppose we have a first process that sets 42 as a value in the manager
27:02
and a second process that sets a dict as the second value in the manager and a last process that sets a datetime as the value in the manager. Those three variables, first, second and last, will be available to a fourth process
27:21
that is able to read their content and print them. As long as that process knows about the manager, so as long as they were all forked from the same parent, they can read and write to the same manager and the data is available to all the processes that are sharing that same manager.
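The four-process example described above can be sketched roughly as follows. This is not the code from the slide: the function names, the dict content and the datetime value are made up for illustration, and the "fork" start method is requested explicitly, which assumes a POSIX system.

```python
import datetime
import multiprocessing


def set_first(ns):
    ns.first = 42


def set_second(ns):
    ns.second = {"key": "value"}  # illustrative dict, not from the talk


def set_last(ns):
    ns.last = datetime.datetime(2019, 7, 10)  # illustrative datetime


def print_all(ns):
    # Fourth process: only reads what the other three stored.
    print(ns.first, ns.second, ns.last)


def run_demo():
    # Assumption: "fork" is available (Linux, macOS); it is not on Windows.
    ctx = multiprocessing.get_context("fork")
    # ctx.Manager() forks the manager's server process right here, once.
    with ctx.Manager() as manager:
        ns = manager.Namespace()
        for target in (set_first, set_second, set_last, print_all):
            proc = ctx.Process(target=target, args=(ns,))
            proc.start()
            proc.join()
        # The parent shares the same manager, so it sees the values too.
        return ns.first, ns.second, ns.last
```

Calling run_demo() prints the three values from the fourth process and returns them to the parent. Each worker only receives the namespace proxy; the data itself lives in the manager's process.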
27:43
The only thing you need to make sure is that you don't fork the manager in the middle of your code because the way it works is, as I told you, it makes a fork of your process, so you don't want your code dynamically forking processes, copying the whole memory of your process including all the data that you would need
28:01
storing in that process, because you probably know that forking relies on copy-on-write, but you probably don't remember that Python has built-in reference counting, so any time you access an object, you are actually going to break the copy-on-write because you have to increment the reference count in the object. So forking and copy-on-write in Python
28:22
are not as efficient as you would expect. And here there is the example that I told you about, where you can actually access the namespace from any machine on any system as long as they are connected to a network. You can see that I can create a manager
28:43
and tell the manager that it's going to listen on port 50000. I specify it just because that way I know which port it's listening on, otherwise it's something dynamic that I will never know. And then I tell the manager to serve connections forever.
29:02
So I just start a process that stays there, blocks and listens for requests and goes on and on serving any request that it receives. And from a second process, I make another manager that instead of listening for requests,
29:20
connects to the other manager. So I can tell the second manager to connect to the other manager. And the interesting part is that if I set the value in the first manager, like in this case I set five in value, the other process is perfectly able to print that value.
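A remote manager of this kind can be sketched as below, under some assumptions that are not from the slide: the names StoreManager and get_store, the authkey and the shared dict are hypothetical, the serving side runs in a background thread instead of a separate blocking process so the sketch fits in one file, and port 0 lets the operating system pick a free port where the talk fixes 50000.

```python
import threading
from multiprocessing.managers import BaseManager

_store = {}  # the data held by the serving side


def get_store():
    return _store


class StoreManager(BaseManager):
    """Hypothetical manager exposing a single shared dict."""


StoreManager.register("get_store", callable=get_store)


def start_server(host="127.0.0.1", port=0, authkey=b"secret"):
    # Serving side: listen on the address and serve requests forever.
    manager = StoreManager(address=(host, port), authkey=authkey)
    server = manager.get_server()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server.address  # the real (host, port) that was bound


def connect(address, authkey=b"secret"):
    # Client side: a second manager that connects instead of listening.
    client = StoreManager(address=address, authkey=authkey)
    client.connect()
    return client.get_store()  # proxy to the dict on the serving side


def run_demo():
    address = start_server()
    _store["value"] = 5  # set on the serving side
    remote = connect(address)
    remote.update({"other": "written remotely"})  # write through the proxy
    return remote.get("value"), _store["other"]
```

The proxy returned by get_store() only forwards public methods such as get and update, which is why the sketch avoids indexing with square brackets. On two real machines the client would pass the server's host name and port to connect().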
29:40
Even though that value was stored on a different machine, doesn't even have to be on the same machine because we are relying on a network connection. You just need to pay attention to the fact that the modules, if you store complex objects like classes or things like that, those modules need to be available on both systems because otherwise
30:02
we will face pickling issues, of course. But that's the only real thing you need to care about: as long as you have the same environment on both machines, you are able to send values back and forth across the whole network without having to care too much. One more really convenient feature
30:21
that I think that many of you have faced at least once in your life, is that if you have to write HTML output, maybe a webpage, maybe send an email, you know that you need to escape anything that you place in that email and that the user provided to you. Because otherwise if the user provided malicious content
30:43
like JavaScript or similar, you are going to send that JavaScript as is to the browser of your users, and it will be executed, and it can be a very powerful vector of attack. This is still a very common issue in web applications,
31:01
like 90% of the security reports that you can find for websites out there are usually through this vector. Which means that this is something we developers are not very good at handling. And the most common way to handle this problem is by doing manual escapes on every single page
31:20
where you inject the value in the output. And that's not something that works because it gets boring and tedious and humans are not very good at doing boring things. They start skipping them, saying, oh no, here we will never append anything and things like that, or they forget and so on. So the best approach to this problem
31:40
is actually having escaping by default. Have the software escape everything unless you explicitly don't want to escape it. So you don't have to remember to escape things, and in the worst case you show something that is broken to the user but that's not malicious in any way. And the way we can achieve this
32:01
is through the standard library string formatter. You may know format strings in Python as being based on the string formatter. Actually they are not really implemented on top of the string formatter, but they provide the same exact feature,
32:22
but the string formatter is something that we can customize. So it provides the string formatting feature through a class that we can customize. In this example I made an HTML formatter that allows me to format HTML and why is that important? Because every time I replace a value in my target string
32:43
it actually gets escaped for me. So if you see in the output, the strong tag in the name was escaped, so I didn't put HTML in the result. I properly escaped that. But the title, where I explicitly stated that it's markup, so I want to have it as markup, was not escaped,
33:03
was formatted as is. There are very cool libraries out there that take care of this. The most famous one, I think, is MarkupSafe from the Flask guys, but it's something that you can easily achieve without the need of any external dependency
33:22
or any web framework at all. It's available in the standard library itself just by subclassing the string formatter. And the way the string formatter works is that every time it has to replace a value, every time it has to format a value, it's going to call the get_field method.
33:41
It's going to tell the get_field method, hey, I need the value of this variable. Like in this case, we had the name and title, so every time the formatter has to resolve name and every time it has to resolve title it's going to call the get_field method. And what we can do is that after we actually retrieved
34:01
that value from the class, from the parent class, we can check if that value is a markup value, and the protocol on which the markup class is based is that it has a __html__ method. So if our class, if our value is a markup-based class
34:23
or respects the markup protocol, we are going to put it as is, or rather formatted as HTML, by calling the protocol. If instead our value is not markup, we are going to escape it. If it's a string, let's escape it.
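An escaping formatter along these lines can be sketched as follows. The class names HTMLFormatter and Markup and the template are illustrative rather than copied from the slide, and the sketch assumes that only plain strings need escaping, so numbers and other objects pass through untouched.

```python
import html
import string


class Markup(str):
    """Tags a string as trusted HTML that must not be escaped."""

    def __html__(self):
        return str(self)


class HTMLFormatter(string.Formatter):
    def get_field(self, field_name, args, kwargs):
        # Let the parent class resolve the value first.
        value, used_key = super().get_field(field_name, args, kwargs)
        if hasattr(value, "__html__"):
            # The value follows the markup protocol: use its HTML form.
            return value.__html__(), used_key
        if isinstance(value, str):
            # Any other string is untrusted: escape it by default.
            return html.escape(value), used_key
        return value, used_key


formatter = HTMLFormatter()
page = formatter.format(
    "<h1>{title}</h1><p>{name}</p>",
    title=Markup("<em>Hello</em>"),
    name="<strong>Bob</strong>",
)
```

Here page keeps the em tags of the title, while the strong tags in name come out as &lt;strong&gt; and &lt;/strong&gt;.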
34:42
That way every time we format the value, it gets escaped automatically and we don't have to remember to escape it. And here you can see a very simple implementation of the markup protocol. It just relies on having a __html__ method that returns the HTML representation of that object,
35:03
which in case of a string, it's the string itself. So it's very simple. It's just a way to tag a string as something that has to be formatted as is instead of being escaped. So you can see that in five lines of code, you actually end up having the same exact feature
35:22
that you would have by installing the external dependency. And it's a feature that can save your life in many contexts. I think that most of you already know this comic strip, but this is very funny.
35:41
One last thing I want to show you, actually two last things I want to show you, but this is one of my favorites, as I told you. I'm involved in web development and I'm maintaining a web framework. So it's very fun for me to talk about how to write a web framework. And it's something that we can do in five minutes through the standard library, because if you look very well into the standard library,
36:03
you will see that you have all the pieces that you need to write a fully functional web application that take care of the most complex part of handling web requests for you. So this is the web framework that I wrote only relying on the standard library.
36:20
And it might look like something that you already use in your life. It's very similar. We just declare a WSGI application and then attach all the functions that we want to routes that get served for us. And the routes can even be complex, like having regular expression arguments and things like that. So any time I access the root,
36:41
I will see hello world. Any time I access link, every time you click the link, try this other link, it gets me to an endpoint that accepts some arguments and I can print back the arguments and things like that. So it's practically a fully functional web framework for most basic needs.
37:00
And at the end, I just have my web application, so it will listen on HTTP for me. The two concepts that I use as a convenience, you don't really need to do that, but it can be convenient to represent the incoming data and the outgoing data with request and response classes. That's what you're probably used to
37:20
in other web frameworks. And the response class only has to take care of remembering what's the status that you want to send back and what's the address that you want to send back. And then provide a simple method to send back that response. While the request is going to keep in memory all the information that got sent to us by the browser
37:43
and allow us to read what was the requested path, what were the requested arguments and things like that. The web framework itself, it's just this code. Nothing more, nothing less. So it's very simple. The route decorator, which is the one we used here
38:03
to register the various endpoints, just saves the endpoints you wanted to register into a list of available endpoints. So it keeps somewhere recorded that for this URL we want to call this method, nothing else. Of course, as the endpoint can be a regular expression,
38:22
we compile that regular expression. The serve method is even simpler because it relies on something that is ready-made in the standard library, the WSGI reference server. So we create a web server, a WSGI server that listens on port 8000.
38:41
And every single request that gets received on that port is forwarded to self, so to your web application. And it's forwarded by calling the provided callable object, so in this case by calling our __call__ method.
39:02
So every time we receive a request, our __call__ method gets called. And what we do is that we create the request representation, we check if any of the registered routes matches with the request path that we got, and then we call it.
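The pieces described so far (request and response classes, a route decorator that stores compiled regular expressions, a serve method built on the WSGI reference server, and a __call__ method that dispatches) can be sketched roughly like this. It is not the code from the slides: the class names, the 404 fallback and the plain-text content type are assumptions made for illustration.

```python
import re
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server


class Request:
    """Keeps what the browser sent us: path and query arguments."""

    def __init__(self, environ):
        self.environ = environ
        self.path = environ.get("PATH_INFO", "/")
        self.args = parse_qs(environ.get("QUERY_STRING", ""))


class Response:
    """Remembers status and body, and knows how to send them back."""

    def __init__(self, body="", status="200 OK"):
        self.body = body
        self.status = status

    def send(self, start_response):
        data = self.body.encode("utf-8")
        start_response(self.status, [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(data))),
        ])
        return [data]


class WSGIApp:
    def __init__(self):
        self.routes = []  # list of (compiled pattern, endpoint)

    def route(self, pattern):
        def decorator(func):
            self.routes.append((re.compile(pattern), func))
            return func
        return decorator

    def serve(self, host="", port=8000):
        # Blocks forever, handing every request to self.__call__.
        with make_server(host, port, self) as httpd:
            httpd.serve_forever()

    def __call__(self, environ, start_response):
        request = Request(environ)
        for pattern, endpoint in self.routes:
            match = pattern.fullmatch(request.path)
            if match:
                response = endpoint(request, **match.groupdict())
                break
        else:
            response = Response("Not found", status="404 Not Found")
        return response.send(start_response)


app = WSGIApp()


@app.route(r"/")
def index(request):
    return Response("Hello World")


@app.route(r"/hello/(?P<name>\w+)")
def hello(request, name):
    return Response(f"Hello {name}")

# app.serve()  # uncomment to listen on port 8000 (blocks until interrupted)
```

Named groups in the route pattern become keyword arguments of the endpoint, which is one simple way to get the regular expression arguments mentioned earlier.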
39:23
So we just call the registered endpoint, we get back the response that the endpoint provided us, and we send back the response to the browser. That's all. So it's very straightforward. Look if there is any registered callback for the endpoint, call the callback, get the result from the callback,
39:42
and send back the result to the browser. That's all you need to implement a fully functional web framework, I will say. While it's true that I lied: you will probably notice that if you start doing this, in two weeks you are still writing a web framework. So in three months, it will be two million lines of code
40:01
and things like that. And in three years, you will be the maintainer of one more web framework in Python. So don't do that. Use Django, Flask, or whatever you want. But it can be convenient if you only want to quickly expose two or three endpoints in a web application that maybe has to run in a constrained environment where you don't have enough memory space
40:21
or a properly functional package manager to install dependencies or things like that. Or you just don't want to bother maintaining dependencies and things like that. Five lines of code and you are happy with your own web support. The last recipe I want to show you is based on code tracing.
40:41
This is not something that you are going to use in your code itself, but it's something that can be convenient to understand other people's code. So whenever you use a new library or whenever you have to approach a new code base, it can be very hard to understand what's going on. You know nothing about the library or that code base,
41:03
so it takes time before you start understanding how it works at the general level, have a vision of how the library works. And in my experience, the best way to start understanding that, start having that vision, is by following what goes on when you run the library.
41:21
So look at which lines of code get executed, look at which arguments get passed around, follow the execution of the library or of the application you are trying to understand. The problem is that when you try to do this with debuggers, they focus on the specific line you are executing, and after a while,
41:42
you lose the global vision of what's going on. After the third time, you end up deep into the stack of function calls. You don't even remember anymore where you came from. You have all the information you care about for the single current line of code. You have the context of the code surrounding the line of code where you are,
42:01
but you probably forgot where you came from. So it's not very good for having a global vision of what's going on in the application as a whole. And a very good way to have that global vision is to rely on tracing of the whole code.
42:20
So this is a very simple example that I did, and I asked the tracer to trace the execution of function. And what's function? Function is a very simple function that just does the sum of two numbers. And as you can see, I get back the printed code, and I get a little plus next to all the lines
42:42
that were executed. So you can see that in this execution, I didn't enter this if, and so I didn't print this message. I just got a one, b two, and return a plus b. And so I understood what's going on.
43:02
Why didn't I print that message? Because it never passed the if, and things like that. Of course, in this very simple example, it's obvious to understand, but it's already a good way to understand what your code base contains, and it can surprise you, like this food that contains human flesh. I'm not sure I'm going to try it.
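The recording idea behind such a tracer can be sketched as below. One loud assumption: the talk's recipe builds on the standard library trace module, while this sketch uses sys.settrace directly, which is the hook that module relies on, so that the "record every traced line, then print the source with a plus" steps stay visible. The helper names run_traced and format_trace and the body of function are hypothetical.

```python
import inspect
import sys


def function(a, b):
    if a > b:
        print("a is bigger than b")
    return a + b


def run_traced(func, *args, **kwargs):
    """Run func and record every (filename, lineno) that gets executed."""
    executed = []

    def local_tracer(frame, event, arg):
        if event == "line":
            executed.append((frame.f_code.co_filename, frame.f_lineno))
        return local_tracer

    def global_tracer(frame, event, arg):
        # Called when a new frame starts; ask for line events inside it.
        return local_tracer

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the previous tracer
    return result, executed


def format_trace(func, executed):
    """Return func's source with '+' in front of the executed lines."""
    filename = func.__code__.co_filename
    hit = {lineno for fname, lineno in executed if fname == filename}
    lines, first = inspect.getsourcelines(func)  # needs source on disk
    return "".join(
        ("+ " if first + offset in hit else "  ") + line
        for offset, line in enumerate(lines)
    )
```

With result, executed = run_traced(function, 1, 2), printing format_trace(function, executed) marks the if line and the return line, while the print line inside the if stays unmarked because it never ran.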
43:22
How is that all implemented? It's based on the trace module. So inside Python itself, inside the standard library itself, you have the trace module, which allows you to trace the execution of a code base. And what we do is just every time the tracer provides us information, we record that information somewhere,
43:41
and then, in print trace execution, we print the code base with the information of which lines got executed. So it's very straightforward to understand. You just record which lines were traced, and then print those lines back. So you can see all modules that got executed, in which order, which lines were executed,
44:01
and try to understand which is the full flow of a function call that you did. That was the last recipe that I wanted to share with you. If you want to see more recipes, of course, go and read the book. And thank you for your attention.