
Tools for maintaining an open source Python project


Formal Metadata

Title
Tools for maintaining an open source Python project
Subtitle
A walkthrough of some great tools I use for developing, testing, maintaining and managing projects
Number of Parts
130
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Abstract
There is a wealth of amazing tools freely available for open source software developers to use to maintain their projects. Practically every problem we face, or are likely to face, in software development has been lived through and solved many times over. Patterns emerge for dealing with development practices, and we build our software using the tools made by the previous generation of developers. We're very fortunate to operate in an amazing open source ecosystem where we've learned we're stronger when we build on each other's foundations. We're constantly laying new foundations for ourselves, and we share our ways of working with the world. We now have a very sophisticated set of tools for developing, managing, testing and documenting our new projects without reinventing the wheel. But we don't discover all the tools at once; we pick them up as we go along, as we find uses for them and hear about them. I'm going to share a range of great tools I use to maintain some popular open source Python projects, and explain the difference they've made. The talk covers:

- Software packaging and distribution
- Licensing
- Virtual environments
- Software testing
- Continuous integration
- Git & GitHub
- Contributor community
- Project management tools
- Documentation tools

The talk demonstrates examples for development of Python projects on Linux, but the tools can be used cross-language and cross-platform.
Transcript: English(auto-generated)
It's almost ready. It's Ben Nuttall. He will talk about tools for maintaining an open source Python project, a walkthrough of some great tools I
use for developing, testing, maintaining and managing projects. I see he's online, so please share your screen and let's go. Okay, let me get this full screen.
Okay, can you see my slides okay? Perfect. Great. Okay, well thanks for that. So yeah, I'm going to be speaking about tools for maintaining an open source Python project. I'm a software engineer at the BBC. I work in an innovation team called BBC News Labs.
I joined them in January from the Raspberry Pi Foundation; I was there for the last six years, so if you've seen me speak before, I was probably advocating Raspberry Pi. I created the GPIO Zero library and the piwheels project, and I'm one of the contributors to the pyjokes project. I write for opensource.com and you can find me on Twitter and that's my
GitHub. This is what I used to look like in real life, and now I kind of look more like this, as you can see: lockdown, lack of haircut. And this is what I look like online; this is my avatar. So just to give you a brief overview of what this talk covers, I'm going to be talking about how to
organise a Python module and how to structure the files; how people distribute software, the different methods of doing that, and why we do it; using Git and GitHub to version control your software and to share and collaborate on it with other people. I'm going to be touching on virtual
environments, testing your software and automating your testing, documentation, documenting your code and your project and a little bit on licensing your software. What this talk is not,
so this is not going to be a thorough follow-along tutorial, because I mention about 50 different tools and I'm mostly just going to be mentioning them in passing. You're not going to be able to follow along and do examples as I go; I'm just going to briefly go over them and give you the big picture. If you want to learn
more about each of them, I'll tell you where you can find out more. I'm also not going to be telling you which tools you should be using; it's not my job to tell you which tools to use. I'm just sharing the ones that I use, and if I've not mentioned another tool, it's either because I've not come across it, or because the examples I wanted to share
fit the tool I've chosen. Everyone has the right to choose whatever tools they want to use. I'm also not saying that in order to be considered a proper Python programmer you need to know about all of these tools and know them inside out; I hope that comes through. Just to give you some background on a lot of the
contents of this talk and where they've come from. The GPIO Zero library I mentioned: I developed this when I was working at Raspberry Pi. It's a Python library providing a simple API for physical computing with Raspberry Pi. It eases the learning curve
for young people, beginners, and educators. It's also a nice Pythonic API if you're an experienced programmer. It's not just an easier way to do things for kids; it's also quite a nice way, once you know good Pythonic style, to be able to write
nice Pythonic code with your GPIO and your physical computing with Raspberry Pi. You can find the docs and the GitHub project there. I just wanted to share that Kida started playing around with a Pi last year, and he tweeted about how he loved the library. A bit of a humble brag there; quite pleased with that. piwheels, my other project: this is the tooling
for automating the building of wheels of everything on PyPI. Wheels are binary distributions of compiled Python modules. piwheels.org is the repository that's built from the tooling.
It's a whole repository like PyPI.org: a pip-compatible repository that hosts Arm wheels that have been built by the piwheels tooling. We natively compile Arm wheels on Raspberry Pi 3 hardware, targeting Raspberry Pi users. The repository is hosted
on a single Raspberry Pi in a cloud platform, and that single Pi serves over a million downloads every month. That's at piwheels.org. With these two projects, I work with a friend of
mine called Dave Jones. He's a professional programmer and amateur dentist. This is a recent picture of him before he performed some dentistry on his partner over the lockdown period; I'm using this photo with permission. Dave is responsible for implementing my crazy
ideas. With things like GPIO Zero and piwheels, what generally happens is I come up with something and say, well, this would be good if we could do this, and he ends up implementing it. The way we tend to work is I write the first 90 percent, and then he goes on and writes the next 90 percent. Dave co-authored GPIO Zero and piwheels with me. He's also got a bunch
of other really, really cool projects that he's built himself as well. Dave introduced me to a lot of the tools that I'm using in this talk, so I wanted to give him a hat tip for that. When we start writing a Python module, it usually looks something like this.
You just have a Python file named after your project, whatever your project is. You write your code in there. Generally speaking, that's how projects start. You start with that. Now, you might want to throw that up onto GitHub. You can create a repository and push your code up to a personal GitHub repository. For instance, under your own
name. So this is my username on GitHub, and this is a project that belongs to my own user account. You push it up. The way GitHub works, at a very basic level: this is a folder containing a single file, and if I create a Git repository of that and
push it to GitHub, it will essentially put the folder structure of my project online in a really basic way. Obviously, it does much more. You can also create a GitHub organization for your project, especially if you have a project
which comprises multiple components, multiple different repositories; you could have different repositories under a particular organization name. This might be a company, an open-source project, or a wider group, something like that. You can also move things from a personal account to being
under an organization. I did that with both of these projects: GPIO Zero and piwheels have their own organizations, and the multiple repositories belong in each organization. GitHub provides a way to add collaborators to your project. You can invite individual GitHub users, or you can create a team and say, these people have access to these repositories,
or these people have read access if it's a private repository, that kind of thing. With an organization, obviously, you invite them to the organization, and they have whatever access that you've given them. Git has branches. With GitHub, when you push
up multiple branches of code, you could be working on one feature that's not quite ready to be merged yet, and you could be working on it and share it online on GitHub, and other people can see it. Also, other people could be working on other parts of the project on different branches, and they could be managed that way. With GitHub, you get a visual
representation of what's going on there. With GitHub releases, if you tag a git commit with a release name, then you can share them like this, and you can see the different points at which versions have been released. GitHub provides issues, which are a really good way of
both you accepting bug reports from your users, and also for yourself as the maintainer to drive the development of your project and your roadmap. Actually, the way I tend to do things is, if I want to see a feature in my library that I'm going to implement, I create an issue saying,
it would be nice if we had this, or it would be good if we did this, and describe it in whatever detail, and either I get around to doing it, or somebody else might be able to pick it up in the future and do it themselves, commit the code and close the issue. You can tag issues with labels and organize them in different ways.
Pull requests, this is a way for, once your code is on GitHub and accessible to others, other people could take a clone of your project and commit some code, push it, and request that you merge their changes in. A lot of people are able to contribute to some of these libraries
because it's out there on GitHub and they can contribute, and you have the ability to modify or reject or merge changes as appropriate. GitHub also provides project boards, which are either a way of you organizing your existing issues and pull requests,
however you want, but you can also create little notes, which are not issues, just little bits of text describing features you want to add or things you need to address, so you can visualize the state of play, especially if you're collaborating online and not office-based. Having visual representations
of the kind you might use post-it notes for in an office can be really useful for managing the project. So, distributing software: how do we do this? It's quite common for
software to be packaged in such a way that it can be installed by many users. For instance, on Linux, you might expect to be able to install some software with the apt tool, so apt install such-and-such, or with RPM and yum on Fedora and other systems.
Then you've got things like pip, which is a language-specific package manager: pip is Python's package manager, npm is for Node.js, and gem is for Ruby. Then there are the portable Linux formats, so Snap, Flatpak, AppImage; these different methods of distribution have pros and cons,
and they're quite popular at the moment. Then on Mac you've got Homebrew, so you can brew install something. Then there are less sophisticated ways of getting software, like downloading it directly from GitHub. I've talked about GitHub a lot, and I should mention GitLab and the other alternatives
available that provide a lot of the same functionality. There's downloading from SourceForge in the olden days, or providing something for download on your personal website, and things like cURL as well: different methods of distributing software.
Why do we distribute software? First of all, for ease of access. If you make some software and you want other people to be able to use it, they need to be able to download and install it. If you can do that in a uniform way that matches their expectations, then it'll be much easier for them to use it. If I'm using Linux, or Debian,
or Ubuntu, I expect that if software is available, that it's available in apt for me, so I can apt install it, and I expect that it's just there. Or if it's a Python library or
something, I might expect that I should be able to pip install this, and not have to go and find the website where it's hosted on, or the obscure method of downloading it that they've provided. Especially with apt, and things like that, you have a certain amount of trust and confidence that what you're getting is good quality, and that it's the real deal,
it's from the author themselves, and that you're getting it from an official source. For the stability, so you know that this is coming from the right source, and it might be, especially in something like Debian, you know that this is a stable version that's supported
in Debian, and if it's on pip, you can actually go there and look and see, these are all the version numbers, these are the release dates, what version am I on, and you know where you stand, which is really important. Licensing is important to talk about at this point. It's really important for you to choose a license for your project.
It's really easy to just discard this and say, well, it's just open source. If it's on GitHub, then people can do what they want with it, but actually, what people don't consider is, well, if this happened, would that annoy me, or would I be annoyed by somebody's use of my project? If they started selling it, would I think,
well, that's not fair. If they started using it a particular way, if they renamed your project, if Google took your project and renamed it and released it under their own branding, would you be happy about that, or would you rather choose a license that protects you in however you want to be protected? There's lots to think about. It's not a simple issue.
I'm not going to recommend any particular license. If you go to choosealicense.com, that's a great resource for describing what it is that your project needs and what your needs are, and it will help you choose a license that's appropriate. It's also important to include the license with the source code, so include it in
your GitHub project, include it in your files, and when you make a distribution that you share, if you're publishing it to PyPI and it's pip installable, the license should follow the code wherever it goes. When somebody installs it, the license should be with the code
there. It shouldn't be a case of: this came from PyPI and the license is over on the GitHub page, that kind of thing. It's important to keep it with the project. So if you want to start creating a Python module, regardless of which method of distribution you're going to use, essentially you want to start with this. So
you've got your project.py that we had before, which is where the implementation of your project lives. You stick that in a directory named after your project, and you need to create an __init__.py, with double underscores on each side; the name for this is dunder init, which means double underscore.
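Laid out on disk, the structure being described looks something like this (project stands in for your module's name):

```
project/            <- the repository root
├── project/        <- the importable package, named after the project
│   ├── __init__.py
│   └── project.py  <- the implementation
├── README.rst
└── setup.py
```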
So you stick an __init__.py in your project folder, have your project code in another file, and you have a readme file. Now, I'm going to be talking about different formats for readmes and documentation later, but this is a reStructuredText file. And you have a setup.py.
So the setup.py might look something like this. This is a reasonably minimal setup.py; it describes how your project is built, which modules it provides, and things like that. It's using the setuptools module, so it essentially runs on this setup function
provided by setuptools. So: setup, and you provide it all the different information about your project. You give it a project name and a version number (well, it's not strictly a number, it's a string, but you can look up how people tend to version their software).
You give it an author name, a short description (just a one-line string), and a license; you can provide keywords, and a URL where people can find the project, whether that's a home page or just the GitHub page. Now, packages: I'm using the find_packages
function here, provided by setuptools. All that essentially does in this case, because it's a really simple example, is return a list containing the string project, the name of the folder; that's the bit that becomes importable, that gets distributed on the system when somebody installs it. But find_packages will go away and find any
modules provided by your package. And then the long description is what will be shown, as you'll see later, on the PyPI page: the full description of what the project is, which is usually your readme in the GitHub project. And it's good to be able
to replicate that both on GitHub and on PyPI; I'm just using a read function there to open it from a file. So, if you want to publish your Python module on PyPI, the Python Package Index: first of all, you register an account on PyPI.org, you create a .pypirc file with your
credentials, so your username and password, and you install a tool called Twine. If you look at the Python packaging documentation (it didn't used to be very good, but it's got a lot better recently), there's some really good documentation
that you can find on how to go through the full process, but that's the gist of it. And once you've published your module, it becomes available as a PyPI project page, something like this. So this is the one-line short description, this is the full description, which I haven't really made much use of, and the different version
releases and the files that are available, and a link to your homepage, and all that kind of thing becomes available on PyPI. So, __init__.py: there are different choices you can make about how you structure this. I mean, it's possible to just write the full implementation of
your project inside __init__.py, but people don't tend to do that. There are two schools of thought that I use. This is one of them: with GPIO Zero, we want to make things really easy; it's just a library, so people just import things from it. So we want to
make it easy for them to import it and not worry about a nested structure where different things happen to be implemented in different files and folders. We want to be able to provide from gpiozero import LED, or button, or servo motor, that kind of thing. We just want them to be able to import all the bits they need at the top-level namespace. And so in __init__.py, we use relative imports to bring in everything we
need; whether things are scattered around in different files in different locations, we import them and provide them in __init__.py, which means people can import them easily. And then the setup.py contains things like the version number, and that goes straight into setup
and it isn't being imported from here. With a library like this, where you've got code in your __init__.py, it's tempting to put your version number and things like that in __init__.py so that people can import it and see the version, but it can cause conflicts
if you structure it like that, because when you run setup.py, it tries to import your code, which might have dependencies, and that might cause you problems if, for instance, your dependencies
aren't available at the time somebody's trying to build the project. So the alternative way of doing it, if the import structure isn't the most important thing, if your package is a module
that people install to get access to command-line tools, for instance, rather than a library of things they import, this is a good way of doing it: actually putting your version number and all your setup.py metadata inside __init__.py, and then importing it from your module and passing it into setup. And another thing is entry points. So entry
points are a way of providing access to parts of your program that you want to make available as what we call console scripts. So if you want to make a command-line tool, where the command project, for instance, launches some part of your program,
you would do it like this. You provide entry_points in the setup function; you define entry_points as a dictionary. Console scripts are one of the types of entry points (there are others), and that's got a list of each command that you want to provide. So, project:
it's a bit odd that the syntax is like this, that it's all just one string, and that the dot and the colon here are syntax within a string, but this is just how it is. This essentially makes the word project available as a command, and it finds the main function in the cli file in your package called project.
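The entry-points syntax just described can be sketched like this; project, cli, and main are all stand-in names:

```python
# the entry_points argument that would be passed to setup() in setup.py
entry_points = {
    "console_scripts": [
        # one string per command: "<command> = <package>.<module>:<function>"
        # this installs a `project` command that calls main() in project/cli.py
        "project = project.cli:main",
    ],
}
```

After the package is pip installed, typing project at the shell runs that main() function.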
And once it's installed, you'd be able to run something like this. So, virtual environments: a really good way of creating an isolated environment that you pip install your requirements and your package into. You can actually build your project
inside the virtual environment, in such a way that the changes you make in your library are installed in real time. So if you make changes, it's as if you've got the latest version of your project installed in the environment. And you know it's separated from your wider environment: it hasn't got your system Python, it doesn't have the system
packages that are installed; it's just isolated from everything else. I recommend a tool called virtualenvwrapper, which provides this command, mkvirtualenv. With this command, you create a virtual environment called project. And
as soon as you've run it, you'll see the word project in brackets in front of your shell prompt. If you're on Linux, you might be used to typing python3, because that's how you get Python 3 with the system Python. But once you've created a virtual environment, you can tell it to use Python 3, and then python and pip point at your virtual
environment's Python and pip. So you can see I've run which python and it points inside my environment's project directory. Use the deactivate command to close the virtual environment, and when you want to switch to another project, you can use workon.
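As a rough sketch of that workflow, here's the same idea using the standard library's venv module; virtualenvwrapper's mkvirtualenv does the equivalent, plus conveniences like keeping all environments in one place and the workon command:

```shell
# create an isolated environment in a folder called env
python3 -m venv env

# activate it: the prompt gains an "(env)" prefix, and python and pip
# now resolve inside the environment, not the system installation
. env/bin/activate
which python        # points at env/bin/python

# an editable ("develop") install would go here, so that source edits
# take effect immediately (requires a setup.py, so shown commented):
# pip install -e .

deactivate          # leave the environment again
```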
So the first time you create an environment, you don't need workon, because mkvirtualenv activates it for you. But if you want to revisit one, just use workon and the project name. Makefiles: this is a thing I imagine a lot of people are a little bit skeptical of, sort of almost afraid of. They seem like quite a complicated, archaic tool,
but if you strip them down to their basics, they can be really useful and actually quite simple. So for instance, with everything I've shown you so far, you need to be able to provide a way for people to install from the source code. So, pip install .: you just provide
the command make install, which wraps around whatever your install instructions are. And make develop, which in this case installs it in an editable way so that people can develop on the project in their virtual environment. You start small with something like this, and later, once you've got
things like test suites and documentation builders and deployments, all those other things, you can define inside here how each of them should run, and provide them in a really uniform way. So make install, make develop, make test, make deploy, whatever it is that you've got. There are a lot more complex things you can do and lots more you can learn, but I think they're a really good way
to get started. And like all the things I'm talking about, the best way to learn more is to take a look at other people's projects and see what they do. So, testing next. The whole point of testing is that you write tests to validate what your code is supposed to do. You keep your old tests around to make sure nothing breaks in
future. And if other people are working on it, they don't need to know the details of those tests; they just need to run the test suite. And if they introduce a bug in some code that you wrote a year ago, or five years ago, the test suite will tell them about it. So there's an approach called test-driven
development, TDD. For maximum effect, if you're taking that approach, you write the tests before you write the code. So you write by wishful thinking: you say, well, I think the library should do this, and you write it how the user would write it, and you say, well, I assert that this would happen when they run this function. And then you see it fail, and
then you go and write the code that actually makes it pass, and then you iterate. You can write tests that run really quickly, and it's important that they do run fast, because then you're not held up waiting for your tests to run.
And it can be automated: once you push, the tests can run automatically on something like Travis, which I'm going to be talking about. So if, for instance, somebody else writes some code and sends you a pull request, you can see instantly whether the tests passed, whether they added any new tests,
that kind of thing. It's important to be pragmatic when you're writing tests: test edge cases, but don't exhaustively test every single possible combination of inputs. That will run slowly, and it's not an effective way of testing. Writing good tests is difficult; it's an art form, and, like all these things,
it's a learning curve. Having tests is better than not, but having too many tests, or exhaustive tests, is not that useful. The easiest way to get started with testing, I think, is not using any testing libraries, not installing anything, just using
the built-in keyword assert. If, for instance, your project defines a function add, which takes numbers and adds them, you can just import that function and assert that add(2, 2) equals 4. If it didn't return 4, it would fail and
there would be an AssertionError, just a standard Python exception; if it passes, it just carries on. So asserts are a really useful way of quickly testing things. A good way to structure them is to put them in functions: have a function called test_add, and put multiple assertions in there. And pytest, which is a really cool testing library,
can, at its most basic level, be a really nice runner for your standard assert-based tests. If you name your functions and files following its conventions,
so you have a tests folder, your files are named test_something, and your functions are named test_something, it will find and run them. My structure looks something like this: I've got test_add.py, which contains a test called test_add, and you can see that when I ran it, it passed. It's a bog-standard, simple example.
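A minimal sketch of the layout just described. The add function is defined inline here as a stand-in for importing it from your own package:

```python
# tests/test_add.py -- pytest discovers files named test_*.py and runs
# any functions named test_*; plain asserts need no extra library.

def add(a, b):
    # Stand-in for "from myproject import add" in a real project.
    return a + b

def test_add():
    assert add(2, 2) == 4
    assert add(-1, 1) == 0
```

Running pytest from the project root collects and runs test_add; a failing assert is reported as an AssertionError.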
But you can imagine that for much bigger projects, you have reams and reams of tests passing, and you see when anything fails, when you've broken anything. pytest also gives you some additional features. The main one I use is
testing exceptions. It's impossible, using assert on its own, to assert that an exception got raised, because the exception will blow up your program. The way you do it is you import pytest, use the context manager with pytest.raises(SomeError), and put the code that
you're expecting to raise the error inside. If it doesn't raise the error, the test fails. So that's a good way of testing that as well. Mock is a really good library too. Since Python 3.3 it has been in the standard library, in the unittest module.
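The pytest.raises pattern just described might look like this. The divide function is a hypothetical example, used purely for illustration:

```python
import pytest

def divide(a, b):
    # Hypothetical function under test.
    return a / b

def test_divide_by_zero():
    # The context manager asserts that the block raises ZeroDivisionError;
    # if no exception (or a different one) is raised, the test fails.
    with pytest.raises(ZeroDivisionError):
        divide(1, 0)
```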
So here's a really simple example of using Mock in your tests. You can create a mock object that, in this case, has a method called message with a return value of "hello". You're mocking up an object that has a method with some predetermined return value, and
you can see there I've got my mock object and its repr, and when I call m.message(), I get the string "hello" back. Another thing mock comes with is something called patch, which is a good way to patch some functionality that's not in your library
but that your library relies on. For example, this is from the GPIO Zero tests. We have an interface for dealing with the time of day: a time-of-day object is active between the times that you set, and inactive outside of those times. So you could wire this up to, say,
an LED, and say this LED should be active when this time-of-day object is active, so the LED is on between the hours of seven and eight. It's much the same as a button connected to an LED, where pressing the button is what controls the
LED, except this is a time construct rather than a physical button. Underneath, I'm obviously using datetime, so in the test I patch the datetime instance within my library and say: the first time they call datetime, I want it to return this particular time. At 6:59, I assert that
the time-of-day object is not active, because it's not seven yet. Then I tell it, the next time you call datetime, return 7 a.m., and now the assertion should be true. At 8 a.m. it should still be true, and at 8:01
it should have gone back to being inactive. All I'm doing is patching datetime. I'm still actually testing the library, still doing an effective test, but I'm controlling the thing underneath that I otherwise couldn't, rather than having to take the current time, add a minute, and so on. That's a much simpler way of doing it.
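A self-contained sketch of that patching pattern. The TimeOfDay class here is a simplified stand-in for the real GPIO Zero one, and patching by name assumes datetime is imported in the module under test:

```python
from datetime import datetime, time
from unittest.mock import Mock, patch

class TimeOfDay:
    # Simplified stand-in for GPIO Zero's TimeOfDay: active between
    # the start and end times (inclusive), inactive otherwise.
    def __init__(self, start, end):
        self.start, self.end = start, end

    @property
    def is_active(self):
        now = datetime.now().time()
        return self.start <= now <= self.end

def test_mock_basics():
    # A mock object with a method that returns a predetermined value.
    m = Mock()
    m.message.return_value = "hello"
    assert m.message() == "hello"

def test_time_of_day():
    tod = TimeOfDay(time(7, 0), time(8, 0))
    cases = [  # (mocked "current time", expected activity)
        (datetime(2020, 1, 1, 6, 59), False),
        (datetime(2020, 1, 1, 7, 0), True),
        (datetime(2020, 1, 1, 8, 0), True),
        (datetime(2020, 1, 1, 8, 1), False),
    ]
    # Patch the datetime name in this module, so .now() returns values
    # we control instead of the real, untestable current time.
    with patch(f"{__name__}.datetime") as mock_dt:
        for now, expected in cases:
            mock_dt.now.return_value = now
            assert tod.is_active == expected
```

In a real project you'd patch the name where your library looks it up, for example patch("mylibrary.devices.datetime"), with that dotted path being whatever module actually imports datetime.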
Tox is a really cool tool for running your tests in multiple Python versions. If you're on Ubuntu, look up the deadsnakes PPA: from it you can apt install multiple Python versions, not just the one that comes with your distribution of Ubuntu.
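A minimal tox configuration along those lines might look like this. This is just a sketch; the interpreter list and test dependencies are examples, not the project's actual settings:

```ini
; tox.ini -- run the test suite under several interpreters
[tox]
envlist = py36,py37,py38

[testenv]
deps = pytest
commands = pytest
```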
And all it takes is a tox configuration file that describes which Python versions you want to run your tests in. You have to have them installed, otherwise it will just give warnings saying it couldn't find that Python. So that's a really good way of running the tests in multiple Python versions on your own machine. There are a lot of times when
you still support an older version of Python but you're using a new bit of functionality, like f-strings or something like that, and it passes on your machine because you're running Python 3.7 or 3.8. But then tox tells you: oh, this fails, because I don't know what this
f-string thing is. So it's good to be able to do that. Coverage.py is another really cool tool that does coverage analysis of your programs. Based on your test suite, it checks which parts of your code, which lines of code, have been touched by your tests and which haven't. And if you've got branching,
like an if, a for, or a try/except, where there are multiple different ways the code could go, it will identify which branches didn't get touched. It could even be something like: it always goes into the if and never goes into the else. It shows you that
you're actually not testing this part of the functionality, which is a good way of finding gaps and getting good coverage from your tests. It's not completely foolproof, because it's really easy to just chase the number, fill in all the gaps, and sort of hack your way
through it. Again, it's an art, with lots to learn, but it's a good indication. Coverage for GPIO Zero, for instance, gives you something like this: these are all the different files, and it shows you which lines are missing from the tests. If something's at 98 percent, perhaps you're not that bothered, but if something's a lot lower, you might want to go and investigate: actually, we're not testing large parts
of this file. So, Travis CI: CI is continuous integration. This is an online service, free to use if your project is open source. You define which Python versions to run, and as soon as you push to GitHub, or there's a
branch or a pull request, it runs all your tests on the Travis servers and gives you a report saying it passed on all these versions, or it failed on 3.5, or whatever, which is really useful. It also feeds back
to your GitHub, as does code coverage: if it's a pull request, for instance, it will post a comment on the pull request saying all the tests passed, but the code coverage went down by one point something percent, that kind of thing. That can be
really useful both for you as the maintainer and for any contributor who filed the PR. So, just revisiting Makefiles with an example here, because running the tests is non-trivial: it's not just a case of typing pytest and hitting enter, because I'm using coverage
alongside pytest, and I'm using a particular configuration file. Defining that in the Makefile makes it much easier for people, because they just know: I run make test. And if that changes in future, if I'm adding another library underneath or the pytest command changes, make test
still works; you just update the definition. So again, for stuff like this, it's really useful to have a simple make test. Documentation. There are, according to Daniele Procida of Divio,
who's a friend of the Python community and a previous chair of the PyCon UK conference, four types of documentation. He has a brilliant blog post on this, which you can read on the Divio website. There are tutorials,
how-to guides, explanation and reference. Reference is quite a common one: you'll find people document their APIs, saying this is a method that works like this, it takes these parameters, that kind of thing. But they'll also bundle in things like backstory, and,
oh, this is an in-joke, and this is how you install it, and if you're on a Mac then you do this, and it kind of bloats and becomes really messy. His whole proposal is that we should be splitting documentation into those four things. He also gives a brilliant talk about that whole concept, and it's really worth reading about on their website as well. But
documentation is really useful. So, again, let's look at really easy ways to get started with these things, and at more advanced routes as well. A really easy way to document stuff is just to put README files in your GitHub repository, written in Markdown. If somebody comes across your
project on GitHub, even if it's not published to PyPI or packaged for Debian or whatever, they can read the README. Here are a couple of examples of what Markdown documentation
looks like on GitHub. Markdown looks like this: it has a really, really simple syntax. A single hash is a title, two hashes is a header, plain text on its own is just a paragraph, and you use hyphens or asterisks to make a list. A link looks like this: you
put square brackets around the text and round brackets around the link itself. There's also a project called MkDocs, which is a Markdown-based documentation builder. It exports static HTML websites from your Markdown documentation, so it's easy to write and easy to deploy, and you can self-host it or put it on GitHub Pages
or something like that. And that's what you can do with Markdown. reStructuredText I find a much more stringent sort of markup language, quite complex, with quite a learning curve, but in essence this is similar
to what I showed previously: you've got a title, a bit more boilerplate, and two list items. This last thing, though, is a little bit different: it's a link pointing to another documentation page within the project. You can do things like that, which are a little bit more clever, a little bit more sophisticated, and it will actually fetch
the page title, because the tooling has context of what all the other pages are, and include the page title in the link that way. There's a project called Sphinx, which uses reStructuredText, and what's really clever about it is that
it extracts the docs from your docstrings. So if you write docstrings anyway, you've kind of already written your documentation, and it will build you a site out of your docstrings. You still have the power to choose which pages exist and where things go. You can also link to the Python documentation, so if you
reference a Python function or class from the standard library, it links there, and you can link to other Sphinx projects too, perhaps your dependent libraries. With Sphinx, for instance, you can write a page and say: I want this paragraph and this title,
and then for each class you just say autoclass, and tell it which parameters and which methods to provide, and it will automatically grab them from your docstrings and fill them out like this. You might be familiar with Sphinx because the Python documentation, and a lot of
projects in the Python ecosystem, use Sphinx. Read the Docs is quite a common method of deployment for those. You can have multiple branches and multiple versions, and access older versions of projects that way, and it's really useful for automation: as soon as you
do a release, or on every branch or every new release, it builds your documentation again. And Graphviz is something I use in my documentation; it's a really cool way of creating little graphs to describe parts of your project. I won't go
over this in detail, but it's just a way of describing the relationship between two boxes, that kind of thing, and you can do more complex things like class hierarchy diagrams, which are automated from your Python code, which can be really cool. So we've got a load of stuff in our project now. Just to summarize what we discussed:
how to organize your Python module and the module structure; distributing software with PyPI and pip; using GitHub and all the different tools it provides; virtual environments; testing and automated testing; documentation; and software
licensing. I tend to write about tools like this as I come across them. I've got a tooling blog at tooling.bennuttall.com, inspired by my friend Les, who does something similar, and I post on there every now and then. So if you're interested in this kind of thing, and in new tools that I come across, do follow that. And that's all from me. Thanks very much.
Thank you very much. We already have questions here. We're running a little late, but we have enough time for two questions. The first one: you mentioned, for Linux,
those packages, pip, et cetera, and also for Mac. What about Windows packaging options with Python? So pip is compatible with Windows. For most projects, you'd be able to use pip exactly the same on Windows. I don't know a lot
about Windows packaging beyond that, but there are people out there in the ecosystem making it work, and there are some really good Python community members who work at Microsoft on those kinds of things. So I don't have any particular expertise to be able
to answer that, but I know that there are options. And Python itself is in the Windows 10 store now, so it's easy to get Python, and it comes with pip, so you can use pip. But there isn't necessarily an equivalent of something like apt for Windows, not quite in the same way. I
think there's something going on at the moment, but it's not quite the complete picture of the open source ecosystem the way there is on, say, Debian. Okay. There is another question, and unfortunately it's also the last question for this session.
What are your thoughts on using GitHub Actions instead of Travis CI? I haven't used it yet. It looks really interesting, and I've been meaning to take a look. It's definitely worth looking at. I've seen some people in the Python ecosystem using it and saying good things about it. Okay. Thank you very much again.
Thank you. And if you want to ask more questions, please go to the Discord channel. You can reach that by pressing Ctrl or Cmd+K and then typing "maintaining", and the first search result is the channel for the talk. I see there's
already some action there, so please continue there. Thank you.