
The Standard Library Tour

Formal Metadata

Title
The Standard Library Tour
Title of Series
Number of Parts
141
Author
Contributors
License
CC Attribution - NonCommercial - ShareAlike 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Are you tired of writing complicated code only to discover that Python has tools in its standard library that could have made your life easier? Join us for a tour of the standard library where we'll dive into less-known modules that do well-known things and well-known modules that do less-known things. This talk is tailored to beginners or anyone who wants to learn more about Python's standard library.
Transcript: English (auto-generated)
Hello, everyone. I'm very excited to be here. Such a great conference. My name is Mia. And I would like to welcome you all to the Standard Library Tour. First, let me tell you something about myself. Oh. So, I'm a software engineer working at Atacama.
Atacama is a Canadian company that specializes in creating data products. I have over five years of experience in the IT industry. I've tried different things like tech support, testing,
analysis and back-end development. I'm based here in Prague. And I am a co-organizer of our local PyCon conference. All right. So, let me ask you at the beginning,
how many of you have ever heard of the Python Standard Library? Raise your hand. Okay. Everyone. How many of you have ever used it? Okay. Is there anyone that hasn't used it? No. Okay. So, it looks like all of you are familiar with this. But just to ensure that we are
on the same page, let me just quickly mention what the Standard Library is. So, the Standard Library is a collection of modules and functions that are included with Python. It offers numerous functionalities. And these functionalities include things like interacting with the operating system, running servers, scientific computing, debugging,
data manipulation and many more. Now, let's talk about why. So, you might be wondering, why should you know anything about the library? And why should you use it instead of, for example, implementing the features by yourself? So, by using the Standard Library, you're not
reinventing the wheel when it comes to finding solutions. The solutions that are there have already been optimized. And by using these solutions, you can avoid encountering a lot of bugs that have already been fixed. Now, just to manage the expectations,
let me just mention what this talk is about. So, this is a brief overview of lesser known features of the Standard Library. And the aim is to discover the unknown unknowns. So, these are the features that you probably didn't know existed. So, the library
is huge. So, I started thinking what exactly to cover because we don't have time to cover everything. So, I started thinking about which modules are well known, meaning that most of you have probably used them, and which are less known and not used much. And also, I started thinking about things that we do as developers every day.
So, some things all of us do every day. But some of them we rarely do. So, I will not talk about less known modules which do some less known things. Because this is a topic for a small audience that does something very specialized. And also, I don't
want to talk about the top 5 functionalities from the Standard Library. Because if you just Google top 5 functionalities, you would probably find them there. Instead, I would like to focus on the other two quadrants. So, let's start with the well known modules which do less known things. So, let's see what's hidden in the modules which probably
all of you have used. So, just a small disclaimer that all of the code examples that I will show are just illustrative. Whether to use the Standard Library or not, of course, depends on your specific use case. So, let's start with
functools. Raise your hand if you have ever used functools. Okay. So, looks like most people. So, it is one of the most frequently used modules inside the library. And it includes a variety of tools for working with functions. So, stuff like modifying function behavior or creating function-like objects. So, let's start with an example. So, this
is an easy function which adds two integers together. But this function should handle two cases. The first case is when both values are integers. And the second case is when they are strings. So, when they're integers, the function should add them together. But
when they're strings, it should concatenate them with a space in between. So, you could write something like this. And if we try to run it with different values, we see that it works. In the first case, it added them. In the second case, it prints hello world. In the third case, it says unsupported data type, which is as expected.
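The slides are not part of this transcript, so the following is only a minimal sketch of the hand-written version described above, placed next to a `functools.singledispatch` variant of the same function, which is the approach the talk describes next. The function names and the exact error message are assumptions made for illustration; the ValueError follows what the speaker says about the fallback case.

```python
from functools import singledispatch


# Hand-written version: one function with a branch per supported type.
def add_if_else(a, b):
    if isinstance(a, int) and isinstance(b, int):
        return a + b
    if isinstance(a, str) and isinstance(b, str):
        return f"{a} {b}"
    return "unsupported data type"


# singledispatch version: the decorated function is the fallback,
# and each registered function handles one type of the FIRST argument.
@singledispatch
def add(a, b):
    raise ValueError("unsupported data type")


@add.register
def _(a: int, b):
    # Chosen when the first argument is an int.
    return a + b


@add.register
def _(a: str, b):
    # Chosen when the first argument is a str.
    return f"{a} {b}"


print(add(1, 2))
print(add("hello", "world"))
```

Only the type of the first argument selects the implementation, so a call such as `add(1, "x")` would still reach the integer version and fail inside it.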
However, there might be a better approach to solving this problem by using the Standard Library. So, here we have a singledispatch decorator above the function add. You can see it on line 4. And then we specify the behavior for each type in a
separate function. So, if the user enters integers, the function defined on line 10 is run. And that one adds two integers together. But if the user enters strings, then the function on line 15 is run. And for all other cases, if the user enters any other data
type, then the ValueError which is defined on line 6 is raised. So, singledispatch is used for function overloading. Just to mention, function overloading means creating several functions with the same name, where they differ from each other in the type
of input parameters and number of input parameters. And it's commonly used when you work with different data types as input to your functions. For example, you're parsing input data from some files and you have different data types. Now, you might be wondering
why the approach using singledispatch might be better than the first example where we just had a lot of if-else statements. So, the advantage is that it's easier to modify. It's because each function is independent. So, you can just modify one function. For
example, only the one that adds integers, and you don't have to modify the other ones. And also, the code is cleaner and more readable. However, it dispatches only on the type of the first argument. So, the downside is that if you need to dispatch on multiple arguments, you cannot use singledispatch, but you would have to use some third party
libraries. Okay. So, let's see some more examples from the functools module. So, let's say we have a function which adds two numbers together. But very often, we only add two. So, you might write something like this where you basically define the default parameter. But now let's say we suddenly start very often adding three.
And you only add two or three. So, in this case, you might go like this. So, you just create another function where you have a default parameter of three. But basically, what you're doing is duplicating the code, right? Because they're kind of the same. So, instead, what you could do,
you could use partial. So, in this case, we define a function called add, which is on line four. And it acts as some sort of a template for other functions, which are in our case the functions add two and add three.
So, we pass add to partial to create these functions, fixing one argument in each case. And in this case, the core logic is kept in one place, which is our function add on line four. And you can have multiple partial functions that are able to reuse that code. So, if we run it, we can see that it returns the same result.
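As a rough reconstruction of the idea described above (the original slide is not available, so the names `add_two` and `add_three` and the keyword used to fix the argument are assumptions), `functools.partial` can be sketched like this:

```python
from functools import partial


def add(a, b):
    # The core logic lives in exactly one place.
    return a + b


# Each partial object reuses add() with one argument fixed in advance.
add_two = partial(add, b=2)
add_three = partial(add, b=3)

print(add_two(5))
print(add_three(5))
```

A fixed keyword argument can still be overridden at call time, for example `add_two(5, b=10)`, which is one difference from writing a separate wrapper function by hand.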
So, partial is used to create a new function with some of the arguments of the original function already filled in. So, it's some sort of a template, we could say. It can be used with any callable, including built-in functions and methods from other libraries, and it accepts both args and kwargs.
So, now you might be wondering what the advantage is, or why you would use it instead of just using the default parameters and duplicating the code. So, the main advantage is code reusability. So, you're adhering to the DRY principle. You're not repeating yourself.
And a common case is when you need to call a function with the same argument multiple times. So, one example from work. So, imagine you have a function where you call various API endpoints and there is an authentication header which you must pass each time you make a call. So, if you set up its value or refer to it at each place
where you call the API, and the authentication suddenly changes, what would happen is that you would have to update all of these places, right? But if you use partial, you can just centralize this logic so that you have one place where you define
the header and then you just reuse it at all places where you call your various APIs. But it has one downside. And that is that it's not really intuitive to new developers. So, for example, if your team is very junior or they have different
backgrounds, they use other languages, for example, which don't have partial functions, then in that case maybe you would want to sacrifice reusability for better readability. Okay. So, let's see one more example from functools. So, we have this Fibonacci function. It is recursive. And it has some sort of expensive operations. So, is there any way we
can improve it and make it faster and more efficient by using the standard library? So, one option is we can use a cache. So, there is the lru_cache decorator from functools. So, basically this decorator caches the result returned by the function. So, the function doesn't
need to execute the code each time. And it can just return the cached results instead when they are available. And from version 3.9, we can also use the cache decorator. So, the least recently used cache decorator stores the results of function calls. It checks for the
key in the cache dictionary. When the key is present, the wrapper returns the value and updates the cache hit info. But if the key is missing, then the wrapper calls the function with the passed arguments and then it updates the cache miss info and returns the result.
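The hit and miss bookkeeping described above can be observed directly. This is a minimal sketch, assuming a plain recursive Fibonacci function like the one the talk mentions; the `maxsize` value is an arbitrary choice for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def fibonacci(n):
    # Without the cache this recursion recomputes the same values many times.
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


result = fibonacci(30)

# cache_info() reports hits, misses, maxsize and the current cache size.
info = fibonacci.cache_info()
print(result, info)
```

Each distinct argument is computed once and counted as a miss; every repeated call is served from the cache and counted as a hit. `fibonacci.cache_clear()` resets both the stored results and the counters.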
In case the cache is full, it evicts the least recently used items and replaces them with the new ones. And now you might be wondering, is there any difference between lru_cache and cache? So, cache is available from version 3.9. And it is the same as lru_cache with maxsize=None. So,
basically it doesn't have any max size. It doesn't evict the old values. So, it might be faster. So, I have benchmarked the average execution speed with and without a cache. Just to mention that this is not a precise method. So, if you run it on multiple machines and multiple times, you would probably get different results. But this is useful just for some rough comparisons
and you can just focus on the ratios here, which are typically consistent. So, here we have three examples. We have our Fibonacci function without caching, with lru_cache and with cache only. And you can notice that the cached versions are noticeably faster. So,
the caching does speed up our code. And you can also notice that the performance of the lru_cache and cache-only versions is basically the same. All right. So, let's explore some more well-known modules. Who here has ever used itertools? Raise your hand. All right. A lot of
people. A bit less than functools. So, itertools provides various functions to create iterators for efficient looping and they're commonly used when you're handling large data streams. So, let's start with an example. Let's say I need to calculate the Cartesian product of
two lists. So, here we have two lists and two nested for loops where we iterate over both lists. Instead of calculating it by ourselves and iterating over the two lists, we can use itertools.product. And as you can see, it returns the same result. So, product returns
the Cartesian product of the input iterables. Let's see some more examples. So, let's say I have a list and I want to keep only the elements that are in another list. So, I could use the built-in function filter, which does exactly that. And now you might be wondering, but what if we want
to do the opposite? Like, I want to keep elements which are not in the list. So, something like not filter. So, we can call itertools to the rescue. We can use filterfalse, which is exactly the opposite of the built-in filter. So, filterfalse filters elements from an
iterable, returning only those for which the predicate is false. And basically, it's the opposite of the built-in filter function. Let's see some more examples with Python built-in functions. So, we have two lists and we would like to combine them together with zip. You can also
notice that the first contains three elements. And the second one contains six of them. So, what happens when we run this code? When we run this code, we can see that only the first three elements are merged together. Why? Well, because it doesn't take the
elements for which there is no value in the first list. So, basically, it just discards them. But what if we want to zip all of the elements, and for those that don't have values in the first list, for example, we want to fill in some other value like None or something else? In that case, we could use the zip_longest function.
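The three itertools functions mentioned so far can be sketched together. The list contents and variable names below are invented for illustration; only `product`, `filterfalse` and `zip_longest` come from the talk.

```python
from itertools import filterfalse, product, zip_longest

# Cartesian product: nested loops versus itertools.product.
letters = ["a", "b"]
numbers = [1, 2]
nested = [(letter, number) for letter in letters for number in numbers]
pairs = list(product(letters, numbers))

# filter keeps elements for which the predicate is true,
# filterfalse keeps those for which it is false.
allowed = ["apple", "cherry"]
fruits = ["apple", "banana", "cherry", "date"]
kept = list(filter(lambda fruit: fruit in allowed, fruits))
dropped = list(filterfalse(lambda fruit: fruit in allowed, fruits))

# zip stops at the shortest input; zip_longest pads the shorter one.
short = ["a", "b", "c"]
longer = [1, 2, 3, 4, 5, 6]
zipped = list(zip(short, longer))
padded = list(zip_longest(short, longer, fillvalue=None))

print(pairs)
print(kept, dropped)
print(zipped)
print(padded)
```

All three return lazy iterators, which is why the sketch wraps them in `list()` before printing.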
Basically, how does it work? It fills in a value for the missing elements and we can also set up our own value. In this example, I set None here. But we can use any other value. Okay. So, so far we have talked about functools and itertools. And I would like to talk
about one more well known module which is also commonly used and has some interesting hidden features. How many of you have ever used collections? Raise your hand. Okay. A lot of people. But still less. So, let's say I have multiple dictionaries and I need to find some key. So, we could do something like this. But what happens if, for example,
I have ten dictionaries and not only two? Then we would have a large if-else statement. And while this approach is functional, it might not be the best approach. So, let's see if we can improve it with the standard library. So, we can use ChainMap.
Here we add all dictionaries to the ChainMap class. And it groups all dictionaries together. So, basically when you're searching for a key in the ChainMap, it will search for it in all grouped dictionaries. And it will return you the first result it finds. Now you might be wondering how about creating a new dictionary and just using the update method
to add all the data. In this case we could also group all dictionaries into one and then iterate over it. Well, ChainMap works in a way that it references the already existing dictionaries and doesn't copy any data, while the update method does.
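A small sketch of the difference between ChainMap and a merged copy, assuming two hypothetical settings dictionaries (the names and keys are made up for illustration):

```python
from collections import ChainMap

defaults = {"theme": "light", "language": "en"}
overrides = {"theme": "dark"}

# Lookups search the dictionaries in order and return the first match.
settings = ChainMap(overrides, defaults)

# A merged copy built with update() is a snapshot of the data.
merged = {}
merged.update(defaults)
merged.update(overrides)

# Changing an underlying dictionary is visible through the ChainMap,
# but not in the copy.
defaults["timeout"] = 30

print(settings["theme"], settings["language"], settings["timeout"])
print("timeout" in merged)
```

Because the ChainMap only holds references, the order of the dictionaries passed to it decides which value wins when a key appears more than once.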
So, ChainMap groups multiple dictionaries into one and it provides a single dynamic view. So, what happens is that when one of the dictionaries gets updated, the update is visible in the ChainMap as well. Speaking of dictionaries, what happens when we attempt to search for a key that doesn't exist yet?
The KeyError happens, right? So, what if we want to prevent it, if we don't want it to appear? Because, for example, we don't have the key there yet, but we will have it. Is there anything we could use from the collections module? So, it happens that collections
has a subclass of the dictionary class that returns a dictionary-like object, which is defaultdict. The difference is, however, that we can use it with non-existing keys. And it doesn't raise the KeyError. Instead, it supplies the missing values using the default factory that was passed. In this case, we instructed it to supply the missing values with a None value.
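A minimal sketch of the behaviour just described, with invented keys; the second part shows counting with `int` as the default factory, which is a common use of defaultdict rather than something taken from the slide.

```python
from collections import defaultdict

# The factory is called for every missing key; here it always returns None.
settings = defaultdict(lambda: None)
settings["theme"] = "dark"
missing = settings["language"]  # no KeyError is raised

# Note: looking up a missing key also stores it in the dictionary.
print(missing, "language" in settings)

# Counting elements: int() returns 0, so no existence check is needed.
counts = defaultdict(int)
for word in "to be or not to be".split():
    counts[word] += 1

print(dict(counts))
```

The side effect in the first part matters in practice: a lookup that is only meant to test for a key will add it, so `in` or `.get()` is the safer choice for pure checks.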
So, if we run it and try to access the value for a key that doesn't exist in the dictionary, then we can see that it returns None. So, defaultdict is a dictionary-like container. It returns a default value for a non-existing key. It is commonly used for
grouping or for counting elements in a collection. And it can make your code simpler and more readable, because you don't need to check if the key already exists or not. And you might be wondering, but what if my structure is nested dictionaries? Can I still use
defaultdict? Yes, you can. You can have multiple nested dicts with a defaultdict as well. Okay. Now, let's move to the second part. So, what are some less known modules which do well-known things? Some things which we do on a daily basis for which there are tools in the
standard library and knowing about them might be useful for us. So, testing. I will not ask how many of you write tests because I hope that all of you do. There are certain frameworks for testing APIs or web applications. You can find a lot of materials on the internet about testing,
pytest and many other frameworks. But I would like to focus instead on a less explored topic and that's the quick and dirty kind of testing without using any frameworks. So, let's say you have some small script. It's not an enterprise solution. It's not a big API,
just some small script on your PC that just calculates something. And you want to have some kind of test just to make sure when you change it that it still works. And ideally, also some documentation. So, is there any way we could add some tests and documentation in a really easy way? So, I discovered the doctest module. It works in a way that
you can add examples in the docstring and you can notice that these examples are both tests and documentation for your script. So, if we run it, we see that nothing happens. But this is normal because there are no errors in our examples. But if we change something,
for example, on line 10, we have changed it from minus 15 to 15, which is a mistake. And when we run it again, we can see that our test failed. Expected 15, got minus 15. So, doctest is used for very simple scenarios. It's mostly for the quick and dirty kind
of testing and documenting very simple scenarios. These test cases are readable to humans. It allows you to test and document your code at the same time. Comparing things. It's also something we very often do. And it happens that in the standard library there are a lot of
tools that we can use to compare things. So, let's see some examples. Sequences. So, here we have two strings. And let's say we want to check if they're the same or not. And if they're not, we want to determine how different they are. So, to compare sequences, we can use the difflib module. So, here we create an instance of the Differ class
and we compare two strings. And when we pretty print the result, we can see that all the characters are the same, except the one on line 8, which is missing in the second sequence. And one on line 9, which appears in the second sequence, but is not in the first one.
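A hedged sketch of this kind of comparison: the two strings below are invented, so the output will not match the speaker's slide. It also computes a similarity ratio with `difflib.SequenceMatcher`, which is assumed here to be how such a ratio is obtained.

```python
import difflib
from pprint import pprint

first = "standard library"
second = "standard librarx"

# Differ works on sequences of strings; iterating a str yields single
# characters, so every output entry describes one character.
# "  " marks a common element, "- " one only in the first sequence,
# "+ " one only in the second.
diff = list(difflib.Differ().compare(first, second))
pprint(diff)

# ratio() returns a float between 0 (nothing in common) and 1 (identical).
ratio = difflib.SequenceMatcher(None, first, second).ratio()
print(ratio)
```

For comparing text files, the same `Differ` is usually fed lists of lines (for example from `str.splitlines(keepends=True)`) instead of single characters.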
And we can also calculate the similarity ratio between these two sequences. So, basically that means how similar or how different these two sequences are. So, here we can see the result, which is 91%. That means they're very similar. So, difflib is used for comparing pairs of sequences of any type, strings, tuples, lists,
as long as the sequence elements are hashable. And it uses the Ratcliff-Obershelp algorithm. If you're interested in that, you can read more about it. And we can also measure the similarity of the sequences, where the values are between zero, no match, and one,
which is an identical match, where a ratio over 0.6 means that the sequences are close matches. So, these were sequences. And how about other things we very often work with? For example, can we compare the contents of two files? So, it happens that there is a module for that in the
standard library as well. Here we have two files. And we can call the cmp function from the filecmp module. And when we print the result, it returns True. That means basically that the files' type, size, modification time, and content are identical. So, the files are equal.
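A self-contained sketch of `filecmp.cmp`, using temporary files created only for this illustration (the file names and contents are assumptions):

```python
import filecmp
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    first = os.path.join(folder, "first.txt")
    second = os.path.join(folder, "second.txt")
    third = os.path.join(folder, "third.txt")

    for path, text in ((first, "hello"), (second, "hello"), (third, "world")):
        with open(path, "w") as handle:
            handle.write(text)

    # shallow=False forces a comparison of the file contents.
    same = filecmp.cmp(first, second, shallow=False)
    different = filecmp.cmp(first, third, shallow=False)

print(same, different)
```

With the default `shallow=True`, two files whose `os.stat()` signatures (file type, size and modification time) are identical are taken to be equal without reading them; the contents are only compared when the signatures differ.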
And that was files. But you might be wondering, how about directories? What if I have, for example, two directories I accidentally cloned and I'm not sure if they're the same or not? So, for directories, there's the dircmp class, which compares two directories. When we call the report method, we can see the difference. So, in this example, we can see that the file
1.py is only in the first directory. But the file 2.py is only in the second one. So, the filecmp module defines functions to compare files. The dircmp class constructs a new directory comparison object to compare directories. And there are multiple functions to define
what to compare and how to show you the results. And it's very useful for cases where we have different versions of the same project. And the last great feature from the standard library I would like to talk about is the context manager. So, let's see an example.
Let's say we have a file which we need to open. We want to do something with it. And then at the end, of course, we need to close it. So, for this, we could use the contextmanager decorator, where we write a function in which we define the operation that is supposed to happen at the very beginning, before we use the file, and the one at the end, after we have finished our work.
And then we can use the with statement, which is on line 15, to use it. So, if we run this one, we can see that first we opened the file. Then we wrote to the file, which was the operation inside the with statement. So, that is line 16. And at the end, we closed the file.
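The open, write, close sequence described above can be sketched as follows. The function name, file name and printed messages are assumptions, since the slide itself is not part of the transcript.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def opened_for_writing(path):
    # Setup: runs when the with block is entered.
    print("opening the file")
    handle = open(path, "w")
    try:
        # The yielded value is bound by "as" in the with statement.
        yield handle
    finally:
        # Teardown: runs when the block exits, even after an exception.
        handle.close()
        print("closing the file")


path = os.path.join(tempfile.mkdtemp(), "example.txt")

with opened_for_writing(path) as handle:
    print("writing to the file")
    handle.write("hello")
```

The `try`/`finally` around the `yield` is what guarantees the teardown step; without it, an exception raised inside the with block would skip the close.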
So, contextmanager defines a factory function for with statement context managers. It provides a clean, easy to read way to manage resources that need some setup and some kind of teardown. And you can nest multiple with blocks, or use several context managers at once in a single
statement by separating them with commas. Typical use cases are opening and closing files, acquiring and releasing a lock, working with network connections, temporary files, changing and restoring global settings. And now you might be wondering, but what if my function is async?
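On the async question raised here, a minimal sketch using `contextlib.asynccontextmanager`; the resource is a placeholder string and the event log exists only to make the order of operations visible.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def managed_resource(log):
    # Setup and teardown may themselves await other coroutines.
    log.append("setup")
    try:
        yield "resource"
    finally:
        log.append("teardown")


async def main():
    log = []
    # An async context manager must be entered with "async with".
    async with managed_resource(log) as resource:
        log.append(f"using {resource}")
    return log


events = asyncio.run(main())
print(events)
```

The structure mirrors the synchronous decorator: the only changes are `async def`, `async with`, and running the coroutine through an event loop.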
Does this work in the async context as well? The answer is, it doesn't. But there is another decorator for it in the library as well. So, if your function is async, then you can use asynccontextmanager. Here is basically the same example, but only async. And here is our main
function. So, basically it behaves exactly the same, but the only difference is that one is intended to be used in the sync context while the second one is for the async one. All right. So, we have covered the second part where we talked about less known modules which can help us with
everyday things we do. And at the end, I have two takeaways I would like you to take away from this talk. You will probably not remember all of these examples and all of these details, which is perfectly fine. But I would like you to remember the following two things. The standard library is packed with very powerful tools, and always verify if there is an existing
tool for your task and if it suits your needs before attempting to create something from scratch. I hope you have enjoyed the talk and that it has inspired you to embark on your own journey of exploring the standard library. Thank you very much for your attention.
I will use my privilege to ask a question. If you had more time for this presentation,
which other modules or things would you like to share? Yes. So, my first version was 40 minutes. So, I had to cut it. I really like the pathlib module because there are a lot of useful tools which we can use there.
And the good thing is all of them are operating system agnostic, meaning it doesn't matter if you write something for Linux or Windows or Mac, you can use it anywhere. Okay. Thank you. Let's give a round of applause again for Mia for this nice presentation. Thank you.