
API Design is Hard


Formal Metadata

Title
API Design is Hard
Title of Series
Number of Parts
8
Author
License
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
"Have you ever been really annoyed about some APIs of open source libraries? There are good reasons to be. Most libraries are devoted to backwards compatibility and cannot change their interfaces anymore. Coming from writing a Python library called "Jedi", I can sympathize. I have hated my own APIs more than enough. I have learned the hard way and want to tell you a few things I would have loved to hear years earlier!"
Transcript: English (auto-generated)
I'm really excited that there are more than 100 people in the room in Switzerland who are doing Python, because this is really not what I know and usually do. I've done about two years of open source work now, and
it's been mostly in my room and on the balcony, so that was different: it was just me and not 100 people. I didn't even expect that there are so many Pythonistas around here. Some of you might have come from further away, but
still. As I said, I've done mostly open source work in the last two years, and then I went to Afghanistan with an NGO, doing some computer work as well. So I
created Jedi and jedi-vim. Who knows Jedi? Maybe, maybe. Okay, so a few of you. Jedi is an auto-completion library. It has grown quite a bit.
It has about 2,000 stars on GitHub. jedi-vim is just an implementation of it that you can use. If you don't use PyCharm, the best way to get auto-completion is Jedi. There are plugins for Emacs and Vim and all the other editors that
you shouldn't use. What I really started to like when I wrote Jedi is writing clean code. I like writing good code, and it's mostly not possible
because I suck at it, but when I start to refactor things, things get cleaner. I also started to think a lot about what I'm doing: about the API and also about class design. So this has
helped, but Jedi still kind of sucks even after refactoring, especially the API. Some things about it are just not that good. This talk is
also inspired by Raymond Hettinger and his talk, and by Alex Martelli. Especially the first talk, "API Design: Lessons Learned", is a really good talk. You can look it up. Don't take the YouTube version; there's another version online.
It's a really good talk because it explains a lot about how standard library stuff grew, and what is good about the standard library and what is not so good. So let's start. If you want to write good code,
there are a few things you can do. You write clean code, you do good architecture, like good API design, you do
testing. Who here uses pytest and tox? Okay, that's way too few. I don't know if there's a talk today about this. I don't think so. So there is? Okay,
so this is good. This is actually really good. You really should be using those two to test your stuff. If you don't test at all, go to this talk. This man is saving your life. And then there's also documentation,
which is really important, and there's a really good tool for this as well. It's called Sphinx. Use it; you're not doing yourself a favor if you don't. I can understand if you don't use it for a web project or
something, but even in those cases it might be really good. Code reviews are something else that can improve your code drastically. The problem is that I review pull requests myself, but
nobody reviews what I push. So that's where I sometimes suck. You cannot be reviewed in a company where you're the only developer, or where there are two developers and one is the intern.
So API stands for application programming interface. That is not really important. We will just be talking about
interfaces in general. So in Python, what would you say: are there interfaces? Who says yes? Who says no? Nobody says... okay. There are kind of
interfaces. You can use abc.ABCMeta. It's something that almost nobody uses, but if you want contract-style interfaces, it's what you can use. As I said, a lot of people don't use
them because you don't need them; Python is mostly duck typed. But they're still great for certain usages. So if you haven't seen it yet,
it might be worth looking at, especially if you're not a total beginner. Then let's start talking about bad APIs. The worst API you can have is having no API at all, because you still have an
interface. You still have something that interacts with the world. It can be an image or a website, but you're still out there and some programmer can read it. My father, who's not even a programmer,
tries to read wind data from an image. He does image analysis and stuff because he wants this data. So there's still an interface. You can read this. You can get it somehow, unless it's
scrambled text. But even then. The second one is going for both solutions. One of the things a lot of people do is: okay, I
have this idea and I have the other idea, and one developer wants this one and one wants that one, so I'm just going to do both. In Jedi, for example, there's this command named names that lists all the names in a
Python source file. You could have done the first one and the second one. You don't know what Script is, but in Jedi that would be perfectly reasonable. But what I
decided is to not do the second one and only the first one. For some other things there's only the second one, like calls on Script, because it makes sense there. I will not explain the whole API, but
this should never be possible. You should be able to call it one way and one way only. This is the Python philosophy; we're not talking Perl here. So there's one way. The third thing is inconsistency. So there are
not just standard violations and this kind of stuff; mostly it's about deciding to name all of your API in a
certain way and then saying: but here we're making an exception. You shouldn't be making exceptions; it should be very consistent. For Jedi, for example, we decided to go with nouns, like completions and
goto_definitions. While you don't know these words, the point is: you go with nouns, and then you go with nouns everywhere. So
this is consistency. At this point, one of the things I realized when I went to sleep yesterday is that most of you are probably not doing APIs, because an API is only
something you do when you write a library, and almost nobody writes libraries; open source people write libraries. But then again, consistency and good API design and all of the stuff I just spoke
about is very important for the casual programmer as well, because you're going to design classes at least. Maybe you don't design classes if you just do very simple Django stuff and are mostly on the HTML side. That doesn't
mean that if you do Django you're doing simple stuff, but there are these websites where it's just "return one plus one" and that's all the coding. If you do that, you might not be doing classes, but once you're doing classes, this stuff
gets kind of important, because you're designing and you have an internal API at least; you still have an interface that you work against when you have a class. So to get back to inconsistencies: there's a
consistency that I think is kind of important in Python, and it's called PEP 8. Most open source projects follow this standard. You can violate it in certain ways: if you think that tabs are better than spaces, okay, but
be consistent and use tabs then. In general, what I would not be doing is this kind of Java style, because almost nobody does it in Python. You're doing Python, don't do Java. If you do Java, write it like this, but if you
do it in Python, then in some other world very, very far away a little puppy will die because you do this. It's not fair, yeah, but still,
it doesn't mean that your whole API sucks. Beautiful Soup, for example, I don't know who knows it, was a good library, and they
changed their naming convention to a more Pythonic way, but Beautiful Soup was still a really good library and the API was great. Nobody complained about that; it just wasn't the standard of Python. Okay, so
let's now move more to the solution side. You have to brainstorm your API. In general, even when you start writing a class, think beforehand about what this class is going to do
and what it's not going to do, and think about what you need outside and what you don't need. All the stuff you need is going to have normal names,
and all the stuff you don't need from the outside view is going to have an underscore prefix. I'm going to talk more about that underscore stuff later. And when you do,
then think about data types. An API should have very simple data types. This is something from PEP 20. PEPs, by the way, for those who don't know,
are Python Enhancement Proposals, and they're a way of driving Python; Python is mostly defined by PEPs. So PEP 20 is the Zen of Python,
and it's just sentences like "simple is better than complex" and "beautiful is better than ugly". So this is just one of them: simple is better than complex. If you can return an integer to a very simple question like "get line number", return an integer; don't return a class that does something
crazy. The same is probably true for I/O. Especially in APIs, don't use pickle, because no other language can read
that; it's just a Python thing. If you need to serialize something in Python and read it again, that's for internal state, like I do myself for some crazy parser state. That's great, but it's not
for communicating with the outside world. You might think: oh yeah, I just need it for Python processes. But then some other guy comes along and uses it differently. One thing you also have to factor into your thoughts
when you think about designing classes is performance. I think optimizing for performance, especially in API design, is very dangerous, because you tend to kill your API one way or the other.
But there's a new async way: Python 3.5 introduced async support in the language, so this, for example, might be a very good way
of doing API design with asynchronous thinking, but that's not available in all the previous versions. So think about that: asynchronous is
great if you need it for performance, but usually you don't. Just keep in mind that if there's a one or two second delay in your API, you might need to go asynchronous.
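As a rough sketch of what an asynchronous API could look like with the async/await syntax available from Python 3.5 onwards: the function slow_lookup and its delay are invented for illustration, and asyncio.run assumes Python 3.7 or newer.

```python
import asyncio


async def slow_lookup(name, delay=0.1):
    """Hypothetical API call that waits on something slow (network, disk)."""
    await asyncio.sleep(delay)  # stand-in for real I/O
    return name.upper()


async def main():
    # Both lookups wait concurrently instead of one after the other.
    return await asyncio.gather(slow_lookup("jedi"), slow_lookup("vim"))


results = asyncio.run(main())
print(results)
```

Callers on older Python versions cannot use such an interface at all, which is the trade-off mentioned above.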
Okay, and this is really important: you should be conservative in your API decisions. If there's one thing you should learn from my talk, it's this: don't make things public that shouldn't be public, because you're going to
regret it. I regretted it, and I still do this. It's mostly when pull requests come up and people say: oh, this would
be such a nice feature. I say: okay, but are you sure that this is the only way of doing it? And am I sure? If I'm not, I just don't merge. Because when you're not conservative, when you just add to the API
on and on, you will have to maintain it, and that sucks. Even when doing just open source work, this bug tracker eats so much of my
time, and I don't think paying customers are better than open source developers. So in general, if you just keep it down a bit and you
say: I'm not going to give in to the temptation of this nice feature, I'm just going to have my four old boring calls, my four old functions, and they're great, and I'm going to improve the inside first, and once that is really good, I'm going to make something new.
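A minimal sketch of that conservative approach, using a leading underscore for everything that is not meant for callers. The class WordIndex and its methods are invented for illustration and are not part of Jedi.

```python
class WordIndex:
    """Hypothetical class: only add() and count() are meant for outside use."""

    def __init__(self):
        self._counts = {}  # internal storage, free to change later

    def add(self, text):
        for word in self._tokenize(text):
            self._counts[word] = self._counts.get(word, 0) + 1

    def count(self, word):
        return self._counts.get(word.lower(), 0)

    def _tokenize(self, text):
        # Internal helper; not part of the public API.
        return text.lower().split()


index = WordIndex()
index.add("Simple is better than complex")
# Everything without a leading underscore is what users will rely on.
public_names = [name for name in dir(index) if not name.startswith("_")]
print(public_names)
```

With only two public names, the tokenizer and the storage can be rewritten later without anyone outside noticing.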
Okay, this fits right in. For anybody who doesn't know about this convention: it's very normal for Python developers to use an underscore
for kind of protected or even private variables. It's kind of a mix, because there's no protected and private in Python, but this is how you say that this variable should not be accessed from the outside. Just use it all the
time; if you don't need something from the outside, it's a really good thing to use. Refactoring support in Python sucks, and so it's even more
important than, for example, in Java, I think. In Java you can just refactor. In Python you're refactoring with a tool like Jedi or PyCharm and you just don't know if it's really right. To be honest, I don't refactor with my own tool. I have certain
refactoring support and I'm still not sure if it works really well, so I don't trust my own tool there. And that's not very good, because my tool is one
of two tools that do it kind of well. So yeah, you choose. There are also double-underscore variables, and they are something kind of
different. I wouldn't recommend using double underscores too much, because it's really a hack in Python itself, a hack in the parser: Python rewrites __name to _ClassName__name,
like this. It's kind of crazy, but at certain points it makes sense. If you really need something private, really only for that class, and you're not using getattr and you're sure you never will,
so for very simple cases, you can use it, but I would really prefer you use a single underscore. Okay, there are naming conventions in Python that I talked
about, but there are also naming conventions that most languages use: nouns are for attributes, verbs are for methods. But really nobody wants to go with get_ in Python. I know where this comes
from: it comes from Java developers coming to Python. At the same time, get_ is still a good thing if you use it carefully. You shouldn't be using it
always, but it can be good. But then there are these Python libraries, and they sometimes use it and sometimes don't, and the only thing in Python that we really know is that
all the libraries are consistently inconsistent about using get_ and not using get_, and about using verbs for methods and not using verbs for methods. So, for example, requests uses text as an
attribute and json as a function. The problem in Python is that you might even use the function dir on that object and you will see
text and json. Do you know which is a function and which is an attribute? You don't. And that's okay, I guess, because there's documentation and you will find out, but still, that's something to think about. I
still haven't decided; I'm also consistently inconsistent about this. I also sometimes use nouns where verbs belong and the other way around. So it's just something you might want to think about for your API or for your
classes in general. One other thing in Python that is really nice, and that not a lot of people are using, is named arguments. One very simple example is
the first one: you will not know what this Twitter search call is actually doing; it's just a 3 and a False. What you could write instead is
"3, retweets=False", and that's it. This is way better. For people who develop exclusively in Python 3, I would actually recommend you use the star there, because that star will not
allow people to call it the old way, without those keyword
parameters. This means you cannot write ", 3, False"; you can only write it the second way. You can write the Twitter search call with just some name in the brackets, but not with ", 3" after it. So the star
is a great addition to Python and I really recommend you use it. So let's go over to properties. Properties in an API are also
something that I like, and at the same time they can be really dangerous. A property for the first kind of function is great: a line
number is something I never ever want to be different somehow. That's just an integer and I don't want an option or two added to it in the API. But if you have something more complicated, like "def docstring",
you might want just a raw docstring with all the whitespace, or you might want a version that cuts away whitespace and all that stuff. So there a
function is better, and you can start very simple. If it's a property, you cannot change its signature. If it's not a property, you can: you can write "def docstring(self, has_whitespace=False)". You can add this, and
if you add it, that hasn't changed the API at all. It's still the same old API; it just has a new option. So with properties
you cannot add options, and with functions you can. Think about that when you design your API. This is true as well for people who just write normal classes and not APIs, because when you do it with properties you
might need to refactor all the calls to that function. If you call it a hundred times and the refactoring tools suck in Python, you will need to go through your code 100 times: grep for it, search for it, replace 100 calls.
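The property-versus-method trade-off might look like this in code. The class Definition, its fields and the whitespace handling are hypothetical and only loosely follow the line number and docstring example from the talk; this is not Jedi's actual API.

```python
class Definition:
    """Hypothetical result object with one property and one method."""

    def __init__(self, line_number, raw_doc):
        self._line_number = line_number
        self._raw_doc = raw_doc

    @property
    def line_number(self):
        # Always a plain integer; no options will ever be needed.
        return self._line_number

    def docstring(self, has_whitespace=False):
        # A method can grow new keyword options without breaking callers.
        if has_whitespace:
            return self._raw_doc
        lines = self._raw_doc.strip().splitlines()
        return "\n".join(line.strip() for line in lines)


d = Definition(3, "  Return the answer.\n      Always 42.\n")
print(d.line_number)
print(d.docstring())
print(repr(d.docstring(has_whitespace=True)))
```

Existing calls of d.docstring() keep working after has_whitespace is added, whereas turning a property into a method later would require touching every place that reads it.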
And then it's just annoying. I've done it, I've been there. So we're slowly getting to the end of the talk, and one thing I want to talk about now is
transitions. Transitions are something you will not get around in your API. This is different for people who never write an API: if you do a transition in your own code, you just refactor and you don't
care about the outside world. For an API this is different. You have versioning and you cannot just change your API the way you want. There's semantic versioning; for people who have never heard of it, it's a
convention about writing versions. You write versions in a certain way, like 1.1.5, where
everything from version 1.x on is non-beta. Semantic versioning looks like this. So this is way too small. This
is a beta version, this is a minor update in a beta version, this is
a major update in a beta version, and if you go to 1.0.0, this is your first version. So semantic versioning can be great for describing your API. But let's say we want to jump from 0.7.1 to 0.8.0: we cannot just change our API,
because people want to update their software and they don't want problems, because that fills our bug tracker, and we don't want a full bug tracker.
It's not that we care about their problems; we care about not having a full bug tracker. So what you can do is deprecate stuff. In the move from Python 2 to 3,
and even before that, a lot of things were deprecated, actually, if you look at it. You can print warnings with python
-W all, with a capital W, and if you start it like that you will see a warning in Python 2 that the module imp is deprecated. So
you can also produce these warnings yourself. You can just call warnings.warn with some message and DeprecationWarning. This is the proper way of deprecating stuff in Python: you write a function,
you add something to the function's docstring that says this is deprecated, and then you also call warnings.warn with some string and DeprecationWarning. This is going to be printed if you run Python with the command
I just mentioned. I can write it down: it's python -Wall. Okay, like this: -Wall.
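A minimal sketch of this deprecation pattern. The function names old_names and new_names are hypothetical, and the version 0.8.0 is borrowed from the version jump mentioned earlier; the docstring note uses Sphinx's deprecated directive.

```python
import warnings


def new_names(source):
    """Return the names found in ``source`` (hypothetical replacement)."""
    return source.split()


def old_names(source):
    """Return the names found in ``source``.

    .. deprecated:: 0.8.0
       Use :func:`new_names` instead.
    """
    warnings.warn(
        "old_names() is deprecated, use new_names() instead",
        DeprecationWarning,
        stacklevel=2,  # report the caller's line, not this one
    )
    return new_names(source)


# Record warnings here so the example shows them regardless of -W flags.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_names("a b")

print(result, caught[0].category.__name__)
```

The old function keeps working for existing users, while anyone running with warnings enabled is told what to switch to.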
And so what is above there in the docstring says ".. deprecated::" and then a version. It's not a call, but it's a way
of telling Sphinx, the documentation tool, that this call is deprecated. This is the beautiful way Python and its ecosystem interact: it's a good way of telling a user who reads your code that it's deprecated, but it's
also a way of telling the documentation that you shouldn't be using this anymore, and so it displays that. One of the things that I realized while
developing APIs is that you can get APIs right when you really think about them, even if you're writing internal tools. One of the things that Jeff Bezos,
the Amazon CEO, did around 2002, a long time ago, is that he went to service interfaces: he started using services only in his company,
and he said something like: anyone who doesn't do this will be fired, thank you, have a nice day. This is kind of crazy in a way, to switch from the old way so drastically, but at the same time, because they were doing this, it allowed them to just make their
cloud public, to put it in a public space and say: you can use this now. Because it was always designed to be used by people, by
pretty much anyone. It wasn't just this internal company thing that had a thousand dependencies everywhere; it was designed from day one to be used by a customer. If the customer is internal, that's fine; if the customer is external, that's
also fine. And this is the same for your APIs. You don't know who's going to use your code in five years, but if it's well written and it's just that module out there in your hierarchy, they can take it out. If it doesn't have any dependencies,
it's good. For example, in my library Jedi there's a parser, and you could pretty much strip it out, replace one or two things, and you're fine, because the parser is
a parser; it shouldn't have dependencies into code analysis. Loose coupling. Okay, so let's wrap up. Use what you learned in API design for your internal APIs.
I've talked about this a lot now, so this is really my conclusion. And be conservative: don't make things public that shouldn't be public. You can even do this in modules:
when I define stuff in modules, I define the functions that I don't use outside with an underscore. It's not just classes. And you should be able to go public with a subpackage without refactoring.
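A small sketch of the module-level convention. To keep it self-contained, a throwaway module is built in memory; the module name textutils_demo and both functions are made up for illustration. Without an __all__ list, a star import skips names that start with an underscore.

```python
import sys
import types

SOURCE = '''
def _strip_comment(line):
    # Internal helper, free to change or disappear.
    return line.split("#", 1)[0].rstrip()


def clean(text):
    """The one function this module offers to the outside."""
    return [_strip_comment(line) for line in text.splitlines()]
'''

# Build the hypothetical module in memory and register it for importing.
module = types.ModuleType("textutils_demo")
exec(SOURCE, module.__dict__)
sys.modules["textutils_demo"] = module

# A star import only picks up the names without a leading underscore.
namespace = {}
exec("from textutils_demo import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

In a real project the two functions would simply live in a file; the in-memory module is only there so the example runs on its own.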
Okay, oh, this was too fast. So this was pretty much it, and thank you for listening. I will start to work on Tuesday at a company called cloudscale, and this is really my first job in a long time, and we're actually
still hiring, so if you want to join us... I'm davidhalter on GitHub and jedidjah_ch on Twitter. So I'm open for questions.
You talked about the Java-like
interface, the Java-like naming thing, and I'm wondering... I mean, it's mostly a personal preference, I guess, but how do you think it's best to handle this when using a library like Qt, for example? You have two clashing naming conventions then. With PyQt you
have the Qt naming convention, which is the camel-case one, and the Python PEP 8 naming convention. What I did is have my own stuff lower-cased and PEP 8, and have the interfaces to Qt and the things I'm overriding camel-case, of course. But I'm wondering what your opinion on this is.
Yeah, I have the same opinion: write stuff PEP 8 compliant. Don't be extremely picky, because then you're just looking at PEP 8 stuff while all the stuff that actually matters and is actually hard gets ignored. But in general,
write PEP 8 software. Why would you care about the library itself? It's just calls to that library, and the function names might look a bit odd, but yeah. Thank you.
Hello, is it on? Okay. I wondered if you could say a bit more about long-term maintenance of libraries and APIs, because honestly it sounds a bit scary. How much of
your life does that take up?
It's okay, because I didn't... if you don't work, it's okay. No, to be honest, it's something that a lot of people are scared of,
and a lot of people also get burned out from it, because it takes probably
an hour a day, and I could probably reduce it to four or five hours a week, but that's not improving the software, that's not even maintenance, that's just the bug tracker and a little bit of bug fixing, the worst stuff. So it takes a lot
of time, but it's also very interesting stuff. So it's a hobby, yeah.
When is the best time to start thinking about your API? You start with
some code on the side where you have functions for yourself, then you refactor it into a separate module in your project, and at some point you start having a library for it. So it's kind of fluid. When is the best time to actually sit down and, you know,
design it?
I think of coding in a very agile way, or that's how I do it. Like you said, it's fluid, it's a process: you change things and you iterate, and that's normal. But when
you think about an API, especially one that has to go public, you cannot change it so easily. So I would really recommend that when you're in front of a problem, you design your API. Mostly the API is not the
complicated thing. Just brainstorm and write down your UML diagrams or something similar. I'm not even capable of writing UML diagrams, but something similar in that kind of way, and you will
see that that helps tremendously. That's my take. Thank you.
I also have a question. Some people recommend that you write the interfaces and the tests
first and then start implementing the interfaces, test-first development. It's quite hard to do because it needs a lot of effort. Would you recommend it? Did you do it in the past?
It's funny that you would ask that, because when we first met,
we worked at a company together and I was the one writing 1,000 lines of regex code, literally just regex calls, 1,000 lines, and I had zero tests back then. So you can imagine the mess.
But yeah, writing the tests first can be something that helps. I don't like writing tests so much; I like to think about the problems first.
It's probably a good discipline. I do write tests, I do test-driven development, by the way, but writing all the tests first for the API, that's probably going to be hard, and you're not going to pass the tests for a long time, probably. So I don't know, I like
writing incrementally, but thinking about the API first and kind of creating it without actually using all of it in the beginning.
Okay, we are out of time, so thank you, Dave. Thank you one more time.