Code quality in Python

Formal Metadata

Title
Code quality in Python
Subtitle
A reasonable approach to measuring code quality in your projects.
Title of Series
Number of Parts
118
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Four years ago I talked about code quality at EuroPython in Bilbao. A lot of things have changed since then. Firstly, most tools I presented are still being developed and have gained new features, but new ones have also appeared that I want to discuss with you. Secondly, Python changed: Python 3 has type hints on board and there is a new tool dedicated to checking the types. Thirdly, I changed. I'm more distanced from my zealous approach of four years ago. I got real and reasonable. That's why I wanted to talk about code quality tools in Python again. I'll talk about all the software that can make code review a bit simpler by pointing out possible errors, duplicates or unused code. I'll talk again about formatters and how they can be used in modern projects. And I'll talk about hobgoblins, if you know what I mean :)
Keywords
Transcript: English (auto-generated)
My name is Radosław Ganczarek, but you can call me Rad. It's much simpler and easier for English speakers. This is not my first EuroPython, and this is not my first talk at EuroPython, because I was here, and by here I don't mean Basel,
in Bilbao, at EuroPython 2015. And the main title of my talk was pretty much the same. But now the subtitle is different: the reasonable approach. So what happened?
Why am I reasonable now? Because over time your view on things changes. And I wanted to share with you a couple of thoughts I have after a couple of years of fighting for better quality of Python software.
So this talk will be mainly about tools. Tools that we use to check the quality or fix things, like formatting. And it will be mainly about Python 3, and the tools that are most up-to-date now,
that are fresh enough to be safe to use. And it will be only about code quality checking, because there are lots of tools in Python for other things, like checking for SQL injection or checking the spelling in your code.
So I will just focus on code quality, and just a little bit about testing. What I won't talk about is IDE-specific and framework-specific tools, because there are very many of them.
So they deserve their own talks. Okay, so let's start with meeting my friends. Unfortunately, they won't join me on the stage, because they are imaginary. The first of them is the Hobgoblin.
You know the quote about the Hobgoblin? Hobgoblins of little minds and stuff. So let's sum up this approach. So what does it mean to be a Hobgoblin?
And are you a Hobgoblin? Or are we, or am I a Hobgoblin? So the Hobgoblin is narrow-minded. He doesn't care about what happens anywhere else, except his own piece of code, his beautiful module that he's working on.
And for that reason he also lacks business perspective. He doesn't think about business. He thinks about making his code as beautiful and as close to the standards as he can. And the third thing is that Hobgoblins are extreme.
Yes, ah, okay, okay, no problem. Okay, how do you, yeah, okay, that's good. Thanks, thanks, so let's talk about Hobgoblins more.
So, Hobgoblins are extreme. They either want to follow all the rules or none of the rules, because only this makes sense for a Hobgoblin. Also, which is a result of all of that,
Hobgoblins value rules the most. So shipping a version of the project that is valuable for the client is not important for them. Okay, so let's meet another friend.
Meet Timmy. Timmy is an average Python developer. Probably you all know a Timmy, if you have worked on a Python project. Timmy is very skilled. He has a couple of years of experience and he has his own style of coding Python,
which might be in some ways unorthodox. And some people say Timmy is mean and they don't like Timmy, but the truth is that Timmy is afraid of changes. And that's why he doesn't like it if you come and say,
oh, we're going to measure the quality of our code. Okay, so the third friend, the Zen of Python. Probably you all know this. I won't say much about it now, because we are going to talk about it in detail.
So for the Hobgoblin, the Zen of Python is like a set of absolute rules. And the Hobgoblin also tends to justify his choices by twisting the Zen of Python guidelines a bit. He says, oh, okay, but the code should be readable.
So we won't use these features from Python 3 because nobody knows Python 3. And on the other hand, Timmy is like, oh, it's like the pirate code, more guidelines than rules. We need to embrace the spirit of the Zen of Python,
not follow the actual rules. Yes, so both of them are in some way wrong. Okay, so let's start with the first one: beautiful is better than ugly.
What does it mean? It can mean lots of things, in fact, but our mind does a couple of tricks. For example, beautiful people seem to be more honest,
more good, and the same is with code. If you look at the code and it looks beautiful, it has lots of space, nice line breaks, and it's well divided, you have a feeling that it's well written.
It's not always true, but it's good to make a good first impression. And to make a good first impression, we can use formatters, which will format our code for us, and there are two main players
I identified. The first is Black. You may know the quote by Henry Ford, that your car can be any color you want as long as it is black, and that's how Black works. It has pre-picked formatting rules, almost no configuration at all, and yes,
it will format your code according to the rules that Black has. On the other hand, you have YAPF, written by Google, and YAPF is very configurable. You can change many things, but the drawback of this solution is that if you gather your people,
gather your fellow developers, and say, okay, let's sit and write a configuration file for YAPF and pick the rules, you're going to spend a month on it, and maybe even more. So Black is good as a default, but YAPF is a good choice if you really want to pick the right rules for your system.
And yes, so the Hobgoblin approach here is to just enable everything. Enable everything and let the guys be shocked by what happened. On the other hand, there is Timmy.
Timmy has been developing in Python for like five years. He has his own formatting style, and he thinks he can't read any other code. But if we have five Timmys and each of them has a different style, our code might turn into chaos,
with everybody shifting it a little toward his own preferences. So using a formatter is a good choice, a formatter we all agreed on, of course.
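One way to see why agreeing on a single formatter is low-risk: two differently formatted versions of the same function parse to the same syntax tree, so only the presentation differs. This is a minimal sketch using only the standard `ast` module, not Black or YAPF themselves; the snippets `timmy_style` and `formatted_style` are made up for illustration.

```python
import ast

# Two hypothetical versions of the same function: one in a personal style,
# one laid out roughly the way a formatter might produce it.
timmy_style = "def total( prices,tax = 0.2 ):\n    return sum( prices )*( 1+tax )\n"
formatted_style = (
    "def total(prices, tax=0.2):\n"
    "    return sum(prices) * (1 + tax)\n"
)


def same_structure(source_a, source_b):
    """Return True when both sources parse to identical syntax trees."""
    # ast.dump() omits line and column information by default, so layout
    # differences do not affect the comparison.
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))


print(same_structure(timmy_style, formatted_style))
```

A comparison like this is also a cheap sanity check after running any automatic reformatting over an old code base.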
Yes, so let's talk about typing. I won't tell you if types in Python 3 are good or bad. You probably already have your own opinion about it, but if you are using types,
it's good to check them with mypy or Pyre. The tools are more or less the same, but you need to check the details to know which one you want. Pyre aims to be faster and more thorough,
but it's still in development. Yes, and let's talk about the simple things, or complex things. Let's say you have a project
where you have very specific needs for static analysis. You have lots of regular expressions and you have a specific style of regular expression that you use and you want it to be followed. So one solution is to write your own static analysis tool,
which can take some time. On the other hand, you can use Bellybutton. In Bellybutton, you just write your own rules using AST expressions. So it's relatively faster than writing your own tool
and, once you learn how to write the rule definitions, adding additional rules is even faster. So as you can see, the Hobgoblin doesn't like this tool at all, because the Hobgoblin likes to just check all the rules,
turn on all the rules, and then sit and watch the world burn. And Bellybutton just makes you ask yourself: what do I want to check? What do I want to be checked in my software?
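To make the idea of a project-specific, AST-based rule concrete, here is a small sketch written directly against Python's standard `ast` module. It is not Bellybutton's rule syntax. The rule itself (regular expressions must be precompiled, so direct calls such as `re.search(...)` are flagged) and the names `DirectRegexCallFinder`, `check`, and `sample` are assumptions invented for this example.

```python
import ast

# Hypothetical project rule: these module-level helpers must not be called
# directly; a pattern precompiled with re.compile() should be used instead.
FORBIDDEN = {"match", "search", "findall", "sub"}


class DirectRegexCallFinder(ast.NodeVisitor):
    """Collect calls of the form re.<forbidden function>(...)."""

    def __init__(self):
        self.violations = []

    def visit_Call(self, node):
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "re"
            and func.attr in FORBIDDEN
        ):
            message = f"use a precompiled pattern instead of re.{func.attr}()"
            self.violations.append((node.lineno, message))
        self.generic_visit(node)  # keep looking inside nested calls


def check(source):
    """Return a list of (line, message) violations for the given source."""
    finder = DirectRegexCallFinder()
    finder.visit(ast.parse(source))
    return finder.violations


sample = (
    "import re\n"
    "PATTERN = re.compile(r'\\d+')\n"
    "def ok(text):\n"
    "    return PATTERN.search(text)\n"
    "def bad(text):\n"
    "    return re.search(r'\\d+', text)\n"
)

for line, message in check(sample):
    print(f"line {line}: {message}")
```

Writing even one rule like this forces the question from the talk: which checks does this project actually need?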
Okay, so let's talk about Pylint. Who uses Pylint? Wow, okay, so this might not exactly be a slide for you, but I know people who just hate Pylint
or, more exactly, are afraid of Pylint. Is anybody here afraid of Pylint? Okay, so people who are afraid of Pylint, I don't mean you because I don't know you, but people who are afraid of Pylint tend to say, oh, it's enough to just take Flake8
and download 50 plugins and it's almost like Pylint. It's almost like Pylint, but it's not Pylint. So I know why Pylint might be scary: it has many rules, it has many things,
and the most important thing is that if a formatting tool shows you a violation, you can fix it in like five seconds. You need to add a new line, add spaces. But Pylint sometimes can show you a violation
that makes you refactor a lot of your code, or maybe a whole module, and that's the painful part of Pylint. However, if you follow it from the start of your project, it's much simpler to follow.
I will tell you later how to deal with old projects and new tools. Yes, and Timmy will say, well, what are these rules for? They're just good practices, every developer follows them. But also every developer is a human
and humans make mistakes. For example, it's maybe a much simpler topic, but do you ever make PEP 8 mistakes? Yes, yes, we do.
Even if we have worked 10 years in Python, we still make these mistakes, maybe less often than in the beginning, but it's not something that you can just learn once and be done with. You need to check. Okay, so let's talk about big, big tools.
There are many of them, or maybe not so many, but they are big. The first of them is Prospector. Prospector is a very nice tool, it's in the GitHub organization with all the cool Python code analysis tools.
And there is another one, wemake-python-styleguide, which is mostly Flake8 with a custom configuration and a lot of Flake8 plugins.
And there are also Pylama and Flake8. I think everybody uses Flake8. I don't, but it's very common. I have a problem with big tools, and you can see this problem here. Let's say I want to use Bandit and Pylint.
I don't have that combination. I might use Prospector and then add Bandit, but in that case, what do I gain from using Prospector? Maybe a couple of configuration files fewer. But yes, the big tools,
they are good for checking things fast, if you are doing an audit or you have some old project where you want to quickly check things. But they can use outdated tools.
Here, the tools that are not in bold are outdated. They haven't had a new version in the last half a year. So yeah, you must think about it if you want to use them. Also, I tried to use both of these big tools
right after installing them and they didn't work. Maybe Prospector worked, but Prospector couldn't run Pylint for some reason. So big tools may have big errors. And if you have a set of smaller tools,
you can deal with errors individually. Also, you don't have any choice of which tools you will have in your project if you use just one big tool. In fact, you choose what they chose for you. And one configuration file, okay,
that might not be a bad thing. After all, we use setup.cfg for all the configuration. And yes, so for Timmy, one big tool is a good choice because it's just one tool, it checks, it's over.
But in reality, usually you want to pick your tools and have only those that you need. For example, if you don't have docstrings, why do you need pydocstyle?
Okay, and let's say we have a big project and we have installed a couple of tools for analyzing code. And we have a code coverage tool enabled.
So what do we do to not be flooded with all the violations? There is a tool called diff-cover, which also has a command called diff-quality, that checks the violations or coverage only on the lines that you modified.
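The idea of reporting only on modified lines can be sketched in a few lines of Python. This is an illustration of the principle, not diff-cover's actual implementation: it assumes a unified diff for a single file and a checker that reports `(line, message)` pairs, and the names `changed_lines`, `only_on_changed_lines`, `DIFF`, and `VIOLATIONS` are invented for the example.

```python
import re

# Matches a unified-diff hunk header such as "@@ -1,3 +1,4 @@" and captures
# the starting line number in the new version of the file.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def changed_lines(unified_diff):
    """Return new-file line numbers that a single-file unified diff adds or modifies."""
    changed = set()
    current = None  # line number in the new file; None until the first hunk
    for line in unified_diff.splitlines():
        header = HUNK_HEADER.match(line)
        if header:
            current = int(header.group(1))
        elif current is None or line.startswith("\\"):
            continue  # file headers, or "\ No newline at end of file"
        elif line.startswith("+"):
            changed.add(current)
            current += 1
        elif line.startswith("-"):
            continue  # removed lines do not exist in the new file
        else:
            current += 1  # unchanged context line
    return changed


def only_on_changed_lines(violations, unified_diff):
    """Keep the (line, message) violations that fall on changed lines."""
    touched = changed_lines(unified_diff)
    return [(line, message) for line, message in violations if line in touched]


# Hypothetical diff and checker output, made up for this example.
DIFF = "\n".join([
    "--- a/app.py",
    "+++ b/app.py",
    "@@ -1,3 +1,4 @@",
    " import os",
    "-def run():",
    "+def run(debug):",
    "+    print(debug)",
    "     pass",
])
VIOLATIONS = [
    (1, "unused import os"),
    (3, "print call left in code"),
]

for line, message in only_on_changed_lines(VIOLATIONS, DIFF):
    print(f"app.py:{line}: {message}")
```

Only the violation on the newly added line is reported; the older one on line 1 stays hidden, which keeps the report short but also means problems outside the touched lines go unnoticed.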
Of course, there might be a problem with it, because if you modify a line and that causes an error somewhere else, it won't catch it. But it's good enough for a quick check. And also, in general, in measuring test coverage,
there is a trap: coverage in fact tells you what code is bad, not what code is good. If you have 85% of the code covered by unit tests, it means that there might be something wrong in the 15%
and the 85% might be good. Okay, so let's talk about readability. Python has a tool called Vulture which can't be used very often, because Vulture looks for unused stuff in Python code.
And if you have coded in Python for more than one day, you probably know that in Python there are many situations where you change the way things are used. You make a dictionary of functions or call a method via the getattr function.
So sometimes it's not visible that something is used. So Vulture gives lots of false positives. But after all, if you filter them, if you look into them, you can track down the really unused pieces of code.
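A small example of the kind of dynamic usage that hides a call from static inspection. The `Exporter` class is hypothetical; the point is that the names `export_csv` and `export_tsv` never appear as direct calls, so a tool that searches the source for unused names is likely to report both methods even though they are used at run time.

```python
class Exporter:
    """Hypothetical class that picks its export method by name at run time."""

    def export_csv(self, rows):
        return "\n".join(",".join(map(str, row)) for row in rows)

    def export_tsv(self, rows):
        return "\n".join("\t".join(map(str, row)) for row in rows)

    def export(self, fmt, rows):
        # The method name is built from a string, so neither "export_csv"
        # nor "export_tsv" is ever referenced directly in the source.
        handler = getattr(self, f"export_{fmt}")
        return handler(rows)


print(Exporter().export("csv", [(1, 2), (3, 4)]))
```

Findings like these are the ones to filter out before trusting the rest of the report.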
So be careful with Vulture. Okay, so what about special cases? The Hobgoblin will likely say, oh, you have special cases, you're not good enough. We all know that it's not true,
because Python projects have many shapes and many topics. So there are various crazy things that happen in Python code. And if you're using tools for checking the quality, there are many features that can help you. In Pylint, for example,
you can use regular expressions for file discovery or even for ignoring lines. Instead of letting Pylint or pycodestyle discover the Python files, you can prepare, with a bash command, a list of files
that really need to be checked. And also, in each case when someone says, oh, it's a special case, I try to think: is this case really special? Because it's easy to say, oh, it's a special case. Let's disable everything here. It must stay how it is.
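The talk suggests preparing the file list with a bash command; the same idea can be sketched in Python. The function name `files_to_check` and the default patterns (skip anything under `migrations/` or `vendor/`) are assumptions chosen for illustration, not a recommendation for any particular project.

```python
import re
import tempfile
from pathlib import Path


def files_to_check(root, include=r".*\.py$", exclude=r"(^|/)(migrations|vendor)/"):
    """Return sorted relative paths under root that match include but not exclude."""
    include_re = re.compile(include)
    exclude_re = re.compile(exclude)
    selected = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root).as_posix()
        if include_re.search(relative) and not exclude_re.search(relative):
            selected.append(relative)
    return selected


# Demo on a throwaway directory tree with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["app/main.py", "app/migrations/0001_initial.py", "vendor/lib.py"]:
        target = Path(tmp, name)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("")
    print(files_to_check(tmp))
```

The resulting list can then be handed to a checker that accepts file paths as command-line arguments, so the checker only ever sees the files the team decided to cover.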
So going the easy way is not always good. Yeah, so there is a tool I found, PyGATRA. And I thought it was a clone of Pylint or something like this, but no. It is just a tool with some custom static analysis checks;
it's worth noting that there are some checks about regular expressions. So I can't recommend this tool blindly; you need to check whether the checks that PyGATRA has include something
that is really useful for your project. Because if they do, you might otherwise waste some time writing your own checks. Okay, so what about checking code automatically? If you have a CI,
you can use tox for configuring all the checks. So really, if you are using multiple tools, as I recommended, you don't need to run them independently. You have tox, you have bash, you can write a command that runs them all. And in the CI, if you have Jenkins
or GitLab CI or anything else, it's really useful for the code checking to go before the tests, because if anything fails, you will have a fast answer.
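The "checks first, tests last, stop at the first failure" ordering can be sketched as a plain Python runner. This is a stand-in for illustration, not tox configuration; the stage list, the `src` path, and the use of pytest as the test runner are assumptions to adjust per project.

```python
import subprocess
import sys


def run_stages(stages):
    """Run (name, command) stages in order and stop at the first failure.

    Returns the name of the failed stage, or None if every stage passed.
    """
    for name, command in stages:
        print(f"running {name}")
        result = subprocess.run(command)
        if result.returncode != 0:
            return name
    return None


# Assumed tool choice and paths: the quick static checks come first, the
# slower test run last, so a style or lint failure is reported early.
STAGES = [
    ("pycodestyle", [sys.executable, "-m", "pycodestyle", "src"]),
    ("pylint", [sys.executable, "-m", "pylint", "src"]),
    ("tests", [sys.executable, "-m", "pytest"]),
]

# In a CI job this would be called as run_stages(STAGES), and the job would
# exit with a non-zero status when a stage name is returned.
```

Because the runner returns on the first non-zero exit code, a failing lint stage means the test suite is never started.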
Okay, this is about the same as the previous one, about special cases, because we really like just disabling rules if anything goes wrong, especially if it's Friday, 3 p.m. and we want to go home. There is also an automatic tool for setting types,
and it can be used for old projects that don't have types, but not for day-to-day work. So to sum up, if you are going to use tools in your project, pick Bandit or Dodgy, no, not Dodgy,
pick mypy, Pyre or pytype. pycodestyle is a minimum, pydocstyle if you are using docstrings, Pyflakes or Pylint, isort, and Black for formatting. Here's a slide with the Spanish Inquisition. Okay, if you have an existing project
and you want to start, you can use tools like autopep8 or autoflake to automatically fix issues, but the fixes need to be overseen by a human. isort for sorting imports, pycodestyle, Pylint. pycodestyle and Pylint are a bare minimum for me.
Yeah, but the most important thing is that you should talk to your team, because it is very important that everybody accepts the tools you are going to introduce, because they might require some additional effort.
Okay, I can skip this one. But the advantages are very good. You get more readability, maintainability, easier refactoring. It might need some additional time and it might produce false positives,
but for me, the advantages outweigh the disadvantages. If you're looking for good tools for Python, visit the PyCQA or awesome-static-analysis links. And after all,
by introducing tools for analysis of code in Python, analysis of good practices, you want to inspire people and lead them to better implementation of code, and not to be a bully. That's all.
Thank you, Radosław. If you have questions, you can raise your hand and I will come to you with a microphone, or you can go to the microphone stands. Hi, great talk. Do you have any suggestions for third-party code, like for your requirements,
to check that code, something like npm audit for Node.js, but for Python? For Node.js? No, no, no, for Python, to check your requirements for third party. Our requirements, yes. Yes, for example, I only know there's an online tool available
for open source projects called PyUp. They are checking the requirements and freezing them in the requirements.txt file. I found a tool for checking the requirements, but it was very outdated, so I think that there might be something I missed,
so I encourage you to look for yourself. Okay, thank you.