
Python Anti-Patterns

Formal Metadata

Title
Python Anti-Patterns
Series title
Number of parts
115
Author
Contributors
License
CC Attribution - NonCommercial - ShareAlike 4.0 International:
You may use, modify, reproduce, distribute, and make publicly available the work or its content, in unchanged or modified form, for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and pass on the work or content, including in modified form, only under the terms of this license.
Identifiers
Publisher
Year of publication
Language

Content Metadata

Subject area
Genre
Abstract
Most people have heard of design patterns at least once, or have focused really hard on studying them. But did you know there are also lots of anti-patterns we should try to avoid? In this talk I intend to present some of the best-known anti-patterns that we promise never to use ... but for some reason or slip end up using anyway. I will start by showing more generic anti-patterns that apply to all languages and software in general, and then move on to Python specifics, using The Little Book of Python Anti-Patterns. Since there will be Python code, attendees should feel comfortable with some basic to intermediate Python knowledge. For example: know about classes, constructors, parameters/arguments, etc.
Transcript: English (automatically generated)
Hello there, we are back. So yeah, I hope you had a great lunch, and I hope you're enjoying the conference so far. Actually, I had a lot of food left over from yesterday's cooking show, so that's my lunch. But
anyway, now that all of you are refreshed, we will have our next speaker, which is Vinicius. Did I pronounce your name correctly? Yeah, that's okay. It's Vinicius. It changes according to the country. Yeah, yeah, it's difficult for me. Like, you
know, I only speak English and Chinese, of course. But yeah, so you're going to talk about anti-patterns in Python, which is very interesting. So I will let you take it away, then. Right. Let me just reset my timer. All right, let's do it. I'll be presenting
now about Python anti design patterns, specifically what we should not do with our code. For anybody who doesn't know me, my name is Vinicius Gubbiani Ferreira. I've been a senior back-end developer at Azion Technologies for about three and a half years now. Azion is a serverless and edge computing
company dedicated to making the web faster and safer. I recommend you guys check our website after this presentation. Recently, I started working in the quality assurance area, for about three months now. I'm also an open source contributor: I work on a daily basis translating the Python documentation into Brazilian Portuguese. And I
also love craft beer and riding a shared bike around the park. Both, I believe, are environmentally friendly. So our schedule for today for this presentation is pretty simple. It's mostly the
motivation for this talk, then generic anti-patterns, and then Python-specific anti-patterns. The motivation is mostly to help you reach the next level: for anybody who is starting with Python, or who considers themselves an intermediate-level Python programmer, to improve their code and make it easier to maintain, to refactor, to read, and to speed up code reviews.
We have several guidelines and tools, such as PEP 8, PEP 20, and Pylint, that sometimes, for example in an emergency situation, we have to bypass or ignore, and they don't cover everything that is
necessary to have good code. So this presentation is mostly trying to expand beyond just our basic guidelines. So, speaking about generic anti-patterns, which apply to any language,
not just Python: before anti-patterns, let's quickly recap patterns themselves. So what exactly is a pattern anyway, or a design pattern? It's mostly a common solution to a recurring problem. It happens at least three times with different teams without any contact at all among them, ends
up being widely adopted, can be considered a convergence methodology, and is also very reliable and effective. The anti-pattern, on the other hand, looks great when we start, until it doesn't anymore. When you
usually start with an anti-pattern, it all looks like an awesome field filled with beautiful trees and flowers, when suddenly things can go very wrong, very quickly, as if you were actually in a maze with monsters. And it often causes more damage than the original problem itself. We start to wonder: maybe we shouldn't have done that after all.
They usually belong to one of three large categories that are proposed by this book, which surprisingly was one of the few books I could find regarding anti patterns. I recommend you guys check it out.
It's also in the references on the slides. The categories are development, architecture, and project management. Since we're not going to be able to cover all of the anti-patterns (the book has lots of them), I picked just a few of them for discussion.
And let's start with Boat Anchor. A boat anchor is usually a piece of software that serves no purpose at all. I actually knew this anti-pattern as zombie code, which is code that is actually dead and is not doing anything except trying to eat your brain out while you figure out what the code does.
And the answer is nothing. The book proposes several approaches for correcting those anti-patterns, and for this one, the solution is to get rid of the code as soon as possible. I know it's easier said than done, but that is the solution. This one, lots of people
have probably heard about: spaghetti code, which is software that is very tangled, with no structure or clarity, and it's hard to work on. Even on projects where there is a lead developer, core developer, or a single developer, if that person stays
away from the code for some time, for example goes on vacation, then they have a hard time trying to pick up the flow of the code. And the solution for this one is to refactor, to clean up, to organize it. We also have the God object, which unlike spaghetti code does have some structure, but unfortunately that doesn't mean it's a good structure.
It's very easy to spot the God object because you can see it from a thousand miles away. Usually it's like a project with a single file, or maybe a class with three hundred, three thousand, four or five thousand lines of code.
Or a little method that you crafted so carefully, but unfortunately it now has three hundred or four hundred lines of code. So the solution for this one is to split it up into smaller responsibilities that are easier to maintain. And speaking about responsibilities, everybody has heard about vendor lock-in. This is an anti-pattern
where our code, our solution, our product relies too much on a specific vendor. And for whatever reason, we decide that we don't want to use that vendor anymore. Let's say the company is going out of business, or the dollar exchange rate just skyrockets and we don't want to pass our costs on to the clients.
Then if we're actually trying to remove that third-party code and didn't plan accordingly, then we are probably in for one hell of a party. It's going to be awesome. And the solution for this one is usually to place
an abstraction layer in front of the third-party solution, like an API for example. Then we only have to care about the interfaces themselves, and we can literally just change the engine under the hood and that's it. We don't have to care about anything else. We also have cargo cult programming, which is pretty much like following a solution or a
piece of code blindly, without understanding what it does or how it works. I also refer to this one myself as Stack Overflow programming, which is something like: hey, I found some awesome code on the internet that just solves my problem.
How does it work? Don't know. Don't care. Move along. No, we should not do that. We should actually aim to understand what any code that we're using does. Everybody has heard, or will hear at some point, about premature optimization. There's a famous quote that says it's the root of all evil.
And in fact, 97 percent of the time we don't have to optimize our code, but we shouldn't pass up those 3 percent where we do have to optimize the code. And the solution for this one is to plan our code, to design, to implement, and after that, profile, benchmark, and load test it.
Then the 3 percent that have to be optimized will show up. If we optimize prematurely, then we're going to make our code unnecessarily complicated. We also have magic numbers. Why do we think they are magic? Because they seem totally random.
And if we actually change them for any specific reason, since there is no explanation of why we picked those numbers, then we're going to have many weird errors and problems. So for this one there are two possible solutions: either a comment explaining what that number does,
or better yet, just define a constant in a constants.py file and use it where the number was. And finally, we have gold plating, which is to continue to work on a task beyond the point where it delivers any visible value to the customer or to the company.
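The magic-number fix described above can be sketched as follows. This is only an illustration: the constant names and values are hypothetical, and in a real project the constants would live in their own constants.py module rather than next to the function.

```python
# Would normally live in constants.py; kept in one block so the sketch runs as-is.
MAX_LOGIN_ATTEMPTS = 3  # hypothetical limit, chosen only for illustration


def is_locked_out(failed_attempts: int) -> bool:
    """Return True once the user has used up all allowed attempts."""
    # The named constant documents the intent that a bare `3` would hide.
    return failed_attempts >= MAX_LOGIN_ATTEMPTS


locked = is_locked_out(3)
still_allowed = not is_locked_out(2)
```

Changing the limit later means touching one line, and every use site still reads as a sentence.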
And the solution for this one, which might be annoying for some, is meetings. Meetings, meetings, meetings. With your boss, with the product owner, with the client, with your fellow developers. For some people it might be annoying, but it turns out during pandemic times it's
actually good to speak to other people every now and then, so we should try that. So, moving on to Python-specific anti-patterns. There's a book with a very descriptive name: The Little Book of Python Anti-Patterns. It's an awesome work created and published under a Creative Commons license by QuantifiedCode, a German-based startup.
Since it uses that license, we can download, read, and distribute it for free. The link is available at the end of this presentation. And just like the other book, it suggests large categories, six of them. Correctness covers all the things that will break our code.
Maintainability, everything that will give us a hard time to change the code in the future. Readability is all about making the code easier to read and to understand. Performance is everything that will slow us down. Security is all about threats. And finally, migrations is all about upgrading our Python package versions.
So again, there's not going to be enough time to discuss all the Antipatterns, so I picked the ones that I thought were most relevant. We'll be focusing mostly on correctness, maintainability, and readability. Let's start with this one, which is probably the worst Antipattern I've found.
There's a specific link dedicated to just this Antipattern at the end of this presentation, which is to not specify an exception when we're using try-except. I actually knew this one a long time ago as Pokemon catch exception, which is to catch them all.
So let's assume it's OK, which it is not, to catch all the exceptions at once with a single except. What we should definitely not do is just pass and move along, as if nothing happened: everything is fine, I just recovered from an exception. No, we should definitely not do that.
We should actually log what's going on to help other people understand what's happening. Otherwise, clients might be in trouble, and we're not aware of that. Preferably, we should try to catch specific exceptions instead of the generic ones.
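A minimal sketch of that advice, assuming a hypothetical `parse_port` helper and default value (neither comes from the talk): catch only the exception we expect, and log it instead of silently passing.

```python
import logging

logger = logging.getLogger(__name__)

DEFAULT_PORT = 8080  # hypothetical fallback value


def parse_port(raw: str) -> int:
    """Convert a config value to a port number, falling back to a default."""
    try:
        return int(raw)
    except ValueError:
        # Specific exception, and a trace of what happened for whoever debugs it.
        logger.warning("Invalid port %r, using default %d", raw, DEFAULT_PORT)
        return DEFAULT_PORT


good_port = parse_port("443")
fallback_port = parse_port("not-a-number")
```

Anything other than `ValueError` still propagates, so unexpected failures stay visible.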
Ignoring context managers for handling files: when we're opening files for reading and writing, we can have any kind of error in our Python code. For this example, I'm forcing a division by zero, on purpose, of course. And if the data hasn't been written to the file yet, we might either lose data,
or even worse, get the file corrupted, and we're going to lose all of its data. So to solve that, we have context managers, which is the with statement on the right. And what does it do? If we have any problem at all, like an error, then the context manager will still call
the __exit__ (dunder exit) method, which will be responsible for writing to the file, closing it, and freeing any resources, the memory itself. One bad anti-pattern is to return more than one variable type. In this example, on the left, we're returning None and a string from the same function.
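The context manager behavior described above can be checked with a small sketch. The scratch file path is an assumption made for illustration, and the division by zero is forced on purpose to mirror the situation described in the talk.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")  # hypothetical scratch file

error_seen = False
try:
    with open(path, "w") as handle:
        handle.write("first line\n")
        1 / 0  # forced error inside the with block
except ZeroDivisionError:
    error_seen = True  # real code would log this

# __exit__ already ran when the exception left the with block:
# the file is closed and the buffered text was flushed to disk.
file_was_closed = handle.closed

with open(path) as reader:
    saved_text = reader.read()
```

Without the `with` statement, the same error would skip an explicit `close()` call unless it sat in a `finally` block.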
This usually gives birth to code that is hard to maintain and to test. So a better idea is just to stick with a single variable type. And if we can't do that, maybe just raise an exception for the other cases.
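As a sketch of the single-return-type idea, the hypothetical function below always returns a string and raises an exception for the failing case, instead of returning None. The function name, password, and code are made up for illustration.

```python
def get_secret_code(password: str) -> str:
    """Always return a str; the failure case raises instead of returning None."""
    if password != "bicycle":  # hypothetical password
        raise ValueError("wrong password")
    return "42"


code = get_secret_code("bicycle")

try:
    get_secret_code("unicycle")
    rejected = False
except ValueError:
    rejected = True
```

Callers no longer need an `if result is None` check after every call; they either get a string or handle the exception.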
Accessing protected members from outside the class. Python is a very permissive language. It allows us to do many, many awesome things. But sometimes a good rule of thumb is: just because we can doesn't mean we should.
And in this particular case, we are accessing protected members from outside the class. The correct way, since they are supposed to be protected, is to implement public interfaces like getter and setter methods. Assigning to Python built-in functions. This is bad, really, really bad. Why so?
First of all, because we're not going to be able to create new lists using the list built-in function from Python. And second, because it hurts our debugging capabilities. I myself use the Python debugger, pdb.
So I stumbled upon code very similar to this, in which I couldn't list more lines above and below the current line, so it was hard to debug. What we should do in these cases is give more meaningful names that describe our variables and our methods.
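A short sketch of the first problem mentioned above: once a name like `list` is rebound, the built-in is hidden in that scope. The function and variable names here are hypothetical.

```python
def shadowing_breaks_builtin() -> bool:
    """Show that rebinding `list` hides the built-in inside this scope."""
    list = [1, 2, 3]  # anti-pattern: the local name now hides the built-in
    try:
        list("abc")  # tries to call a list object
    except TypeError:
        return True
    return False


# Descriptive names leave the built-in untouched.
prime_numbers = [2, 3, 5]
letters = list("abc")

builtin_was_hidden = shadowing_breaks_builtin()
```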
That is the correct approach. Mixing tabs with spaces is also a bad anti-pattern. This is actually very easy to fix with our IDE, so I'm not particularly sure why people most times prefer tabs. Sorry.
Maybe they forget. I don't know. And the reason we should actually use spaces is the PEP 8 convention, which obliges us, well, it doesn't oblige, but it asks us to use spaces. And if you're not actually satisfied by just, ah, there's a rule that says I should use spaces, who cares?
There's another good reason to convince you, which is that developers who use spaces make more money. And if you don't believe me, there's a Stack Overflow survey saying that from about three, four years ago. And this is not fake news. I'm not making that up. The links are in the references. You can check them out at the end.
So if you don't want to use spaces, then that's okay. More money for me, more money for everybody else who uses spaces. The bottom line here is that using spaces pays off, literally. So another interesting anti-pattern is to not use else when appropriate with a loop.
This is kind of tricky. I myself tend to reject this one a lot because it doesn't seem natural to me that a for loop has an else clause, but in Python it does.
And if you are also like me, you feel distressed when you see something like this. Maybe just stay away from the code for a while, like five minutes, ten minutes, go grab some coffee, look out the window. It's probably going to be a beautiful but cold day like today. But when you come back, you'll start to notice that the code, hey, it does seem better.
You don't have to use an extra variable to check if you found what you are looking for. And the catch over here is that we actually need to use the break statement. Otherwise, it will enter the else part of the code.
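The loop-with-else idea can be sketched like this, using a hypothetical search for the first negative number. Note that no extra "found" flag is needed, and that the `break` is what keeps the `else` block from running.

```python
def first_negative_message(numbers: list[int]) -> str:
    """Report the first negative number, or say that there is none."""
    for number in numbers:
        if number < 0:
            message = f"found {number}"
            break  # leaving via break skips the else block
    else:
        # Runs only when the loop finished without hitting break.
        message = "no negative number"
    return message


with_negative = first_negative_message([3, -1, 4])
without_negative = first_negative_message([1, 2])
```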
Not using get to fetch data from a dictionary. Python favors the approach of easier to ask forgiveness than permission, instead of look before you leap as in Java and C. So the best approach for this scenario is usually to go for the get method because
then you don't actually have to check if a key exists or not in the dictionary. Python will automatically check that for you. And the default value that is returned is None. And if you are not actually satisfied by that, you can change the default value.
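A minimal sketch of `dict.get` with and without an explicit default; the dictionary contents are hypothetical.

```python
user = {"name": "Vinicius"}  # hypothetical data

name = user.get("name")          # key exists: returns its value
nickname = user.get("nickname")  # key missing: returns None
city = user.get("city", "")      # key missing: returns the supplied default
```

This replaces the `if "city" in user: ... else: ...` check with a single call.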
On the right, you can see that it is returning an empty string. Using wildcard imports: ideally, we should import only what is being used. And I know I said ideally. I'll get to that in a minute. Why that? Because among other things, using wildcard imports might make names from one library clash with those of another.
For example, if we use from asyncio import TimeoutError, TimeoutError is a very common and generic name. It exists, for example, in the requests library. So it's probably going to get you into trouble, trying to figure out what's going on.
And I know I said ideally before: we might be using a lot of modules, methods, and information from a library. If that is the case, we might be able to import, for example, a submodule instead of the whole module.
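The name clash can be illustrated with two standard-library modules that both define `sqrt`. This example uses `math` and `cmath` rather than the libraries named in the talk, purely because they make the clash easy to demonstrate.

```python
import cmath
import math

# With `from math import *` followed by `from cmath import *`, the bare name
# `sqrt` would silently end up bound to cmath.sqrt. Keeping the module
# prefix makes it obvious which function is being called.
real_root = math.sqrt(4)
complex_root = cmath.sqrt(-4)

# Importing a submodule instead of everything inside it:
from os import path

joined = path.join("reports", "2021")
```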
Using the global statement: this is bad. It's okay if you use it for a quick script or something like that. But for production code, it might not be the best idea ever. Variables in Python are limited by the method scope or even the class scope.
So the global statement is a bit something like, hey, I have these variables that are not inside my scope, but trust me, they are outside. Just go look for them. And it works, but if you have lots of methods changing these variables,
especially with something like asyncio, then you're probably going to have problems very quickly, and maybe very bad ones. And the solution is usually to encapsulate the variables into classes, then they are safe.
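One way to sketch the encapsulation suggested above, assuming a hypothetical page-view counter that would otherwise be a module-level variable changed through `global`:

```python
class Counter:
    """Keeps the state that would otherwise live in a module-level global."""

    def __init__(self) -> None:
        self.count = 0

    def increment(self) -> int:
        # The only place where the value changes, which makes it easy to trace.
        self.count += 1
        return self.count


page_views = Counter()
page_views.increment()
latest = page_views.increment()
```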
Using single letters to name our variables: once again, this is bad. Really, really bad. And for two reasons. First of all, like I mentioned before with the built-in Python functions, it hurts our debugging capabilities. This is actually real code. I stumbled onto something very similar to this twice myself, and I had a hard time debugging specific methods.
So we should actually use better names to make our code more descriptive. After all, everybody has a name. My name is Vinicius. My name is not V. And I'm not going to name my daughter or my son A or B or ALX or temp or any, I don't know, letter in the alphabet.
So let's invest some time, not waste, invest some time in properly naming our variables, because we're also going to debug this code in the future, and not only us, but other developers too.
So let's help everybody out. Comparing things to True the wrong way: the code on the left works. But if you're actually checking for the boolean type specifically, then we should probably use the is operator.
And for a better explanation of this one, I have a few slides over here with examples. We can see that the expression 1 is True is actually False, which is maybe surprising for those who have never used the is operator. The expression 1 == True is True, just as 1.0 == True is.
And on the right, note the memory position of all the variables that were used. This is actually a good question for interview processes, because in Python, we don't have to deal that much with memory management.
So junior and sometimes even mid-level developers don't know about the id function and memory management. So for those who have never heard about memory management in Python, I recommend you guys read a bit about it to understand exactly how this works. It's very interesting.
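The comparisons from the slides can be reproduced with a short sketch. Variables are used instead of literals here only to avoid the SyntaxWarning that recent Python versions emit for `is` with a literal; the variable names are hypothetical.

```python
one = 1
one_float = 1.0
flag = True

equal_int = one == True          # True: bool is a subclass of int, and True == 1
equal_float = one_float == True  # True: numeric equality again
same_object = one is True        # False: 1 and True are different objects
flag_is_true = flag is True      # True: True is a singleton

# id() gives each object's identity, which is what `is` compares.
different_ids = id(one) != id(True)
```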
Using type to compare: we should preferably use isinstance. That is because isinstance checks for inheritance. That means an instance of a derived class is also an instance of the base class, while type just checks for the specific class.
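A small sketch of the difference, using hypothetical `Shape` and `Rectangle` classes:

```python
class Shape:
    """Hypothetical base class."""


class Rectangle(Shape):
    """Hypothetical derived class."""


rect = Rectangle()

exact_type_match = type(rect) is Shape     # False: the exact class is Rectangle
inherits_match = isinstance(rect, Shape)   # True: Rectangle derives from Shape
```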
And finally, not using namedtuples whenever it is possible. I hope everybody has heard about namedtuples at least once. If you haven't, then I recommend you guys stop this talk and this presentation right now. Just kidding. Please don't. We're almost at the end. But namedtuples are awesome and you should check them out.
Unfortunately, they are available just from Python 3.6.4, which is too bad for those who are still with Python 2. And the advantage of them is that when we use indexes such as 0 and 1, much further on in the code we might have no idea exactly what they mean.
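A minimal sketch using `collections.namedtuple` from the standard library; the `FullName` type and its fields are hypothetical. It shows the same value read by index and by attribute.

```python
from collections import namedtuple

# Hypothetical record type with two named fields.
FullName = namedtuple("FullName", ["first", "last"])

name = FullName(first="Vinicius", last="Ferreira")

by_index = name[0]         # works, but the meaning of 0 is not obvious
by_attribute = name.first  # same value, and the intent is readable
```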
But when we use the dot operator, like name.first and name.last, then it's very clear and easy to see and to read what exactly the code is doing. So that was all I had prepared. Here are the references that I mentioned earlier during this presentation.
And you don't have to note them down right now. They are available on the slides, which were already uploaded. And I hope you guys enjoyed it a lot. Like we say back at Azion, move to the edge. Obrigado, muchas gracias, vielen Dank, arigato. And I'm still working on the Chinese and Russian. I can't pronounce them yet.
If you can, please do. If you have any questions at all, feel free to contact me through any of these means or on the conference platform. Thank you so much to the EuroPython Society for the opportunity to present again.
Yes, xie xie ni is the thank you in Chinese. Yeah, there you go. That's in Mandarin. It's not my mother tongue, so take it with a grain of salt, people.
Thank you so much. It's very impressive. Also, it's fun to know that we have so many anti-patterns in Python. I've never heard about that namedtuple, to be honest. So I have to leave, maybe. Thank you. But yeah, that's very interesting. So let me see if there's any questions in the chat.
Yes, there's one. So is there any linter that checks for most or any of these by default? So I'll also put up the questions. Let's see. I believe Pylint will probably check for a few of them, like single-letter variables.
Let's see what else. Maybe going beyond the recommended method length or variable length. That's a bad thing that linters, they cannot tell us all of these anti-patterns. They might catch a few of them, but pretty much just a few of them, unfortunately.
Yeah, I think in the worst case, you just have to sometimes, you know, I have a bad practice of switching off the errors from the linter. So maybe if you really have to, you may have to do this.
So among all these things that you introduced to us, if you have to pick one that is most important and then we should really pay attention to, which one would you choose? You mean the anti-patterns? Yeah, the anti-patterns. I'll probably stick with the variable name because that bugs me a lot.
It makes me distressed, like making the code harder to read and to understand. It's very annoying, at least for me. A few of them can also be easily avoided, but usually only once you get practice. Like the God object: it's very common for anybody who's starting to make large projects
with so many lines of code. And like I mentioned, if you stay two months away from your code, you're going to look into that, or maybe look into that in the future, like five years from now, and you're going to think, Oh God, this is not good. Who made that? Crap, it's me.
Sometimes I'll be like, who wrote this line? I run git blame and see my name. I was like, Oh God. Nothing good comes from using git blame. It's either going to be you or a person who already quit the company. Yeah, so just don't embarrass yourself. So yeah, that's a very, very interesting talk.
So if there's no more questions, I think I would just let you relax and then maybe we can play some ads in between. But yeah, thank you so much. And then if people have questions maybe later, then of course people can find you in the chat and then enjoy EuroPython.
And I'll also pass it back to Nick to be the session chair as well.