FastAPI Internals

Formal Metadata

Title
FastAPI Internals
Title of Series
Number of Parts
131
Author
Contributors
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
FastAPI became one of the most popular web frameworks in Python. It has amazing documentation and an easy-to-use API, which made it very popular. It's easy to start with, and as a developer you have a lot of power over what you can do. But... how does it work internally? In this talk, we will explore the internals of FastAPI. We'll explore the dependency injection system and its benefits and limitations. We'll also see how the routing system works, when the middleware stack runs, how the request and response are handled in detail, how the OpenAPI schema is generated, the differences between async and non-async endpoints, and how WebSockets fit into the whole picture. Furthermore, we'll see how the dependencies Pydantic and Starlette help FastAPI do its job. At the end of this talk, the attendee will understand what's underneath this very popular package.
Transcript: English (auto-generated)
So, yeah. Is it working? Can you hear me? But I cannot see you, so I guess that's okay. So, oh, it's here. So I'll be talking about the internals of FastAPI. I thought the title was a bit bigger than this. I had some other stuff. Well, it doesn't matter.
I'll be talking about the internals of FastAPI. I'll be giving the full cycle of how things happen, from receiving the data from the client and getting to the server and the application, and then going back to the client again. I am Marcelo. I'm from Brazil.
I live in the Netherlands. And what else? Yeah, I work at Pydantic, a data validation package. And I'm also the maintainer of Uvicorn and Starlette. One is a web server, and the other is one of the dependencies of FastAPI.
Hey, man. It's you. And yeah, I'll be talking about FastAPI. And what is FastAPI? Just a general overview. Actually, who here has never used FastAPI?
Okay. Five percent. Who is from Brazil here? One percent. Anyway, so FastAPI is a web framework. It has two dependencies, Pydantic and Starlette. Starlette is responsible for the web-related technologies that FastAPI uses.
And Pydantic does the validation of the data that comes into and goes out of the framework. It also does the generation of the OpenAPI JSON that you use to see the Swagger documentation that people like. It was created in 2018.
And I have been part of the community since 2020. Oh, yeah. To run FastAPI, you need a web server. One of them is Uvicorn. You just run uvicorn main:app. There are alternatives. There is Hypercorn, which was created by Phil Jones.
He's one of the maintainers of Flask, and he also maintains Hypercorn. And there is Granian, which was created by an Italian guy called Giovanni. It's made in Rust. People like to know that nowadays. And yeah, so this is pretty much what I'll be talking about.
How the data comes in from the client, goes to the server, and then goes to the app and goes back. But I'll be focusing more on the server to the application and back, because that's what matters in terms of FastAPI. We want to know how the data comes in, what FastAPI does, and how it comes back.
So I'll be focusing on this part. So how does it really happen? There is one spec called WSGI that specifies how the server needs to communicate with the application for non-async Python. It is used by Django, Flask, and other pre-async web frameworks.
And then there was a new spec called ASGI that was created some years ago. I think maybe 2017 or 18 or somewhere around those times.
It tries to match what WSGI does, but for async. And that's the spec that FastAPI is based on. It uses what is written in this documentation. As a maintainer, I go to this reference all the time.
But I guess most people don't, because it's more low level than what people are used to. So this is the format of what an ASGI application looks like. And this is actually how FastAPI looks underneath.
If you look at this app, it's a callable that has three parameters: scope, receive, and send. The scope is the connection data, so it's a dictionary with all the data from the connection. And receive and send are two callables used to communicate with the server.
You need a way to communicate with the server, to receive data from the server and to send data back to the server. So you use the receive and send callables. The FastAPI object is also a callable and has this format.
This is the simplest ASGI application that you're going to find. You might find even simpler ones, but then they're not as correct as this one. In this one, you have the content type and the content length in the headers.
And then you have a small body, which is just a little word. And then we are using send twice. First we send the HTTP response start. Where is it? Is it working? I don't know. So http.response.start is one of the messages, in which you just send the status and the headers.
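The slide itself is not part of the transcript, so the following is only a sketch of the kind of application being described: a callable taking scope, receive, and send that sends one http.response.start message and one http.response.body message. The call_app helper is a hypothetical stand-in for a real server such as Uvicorn, added here only so the example can run on its own.

```python
import asyncio


async def app(scope, receive, send):
    """Minimal ASGI HTTP application: one start message, then one body message."""
    assert scope["type"] == "http"
    body = b"Hello, world!"
    await send(
        {
            "type": "http.response.start",
            "status": 200,
            "headers": [
                (b"content-type", b"text/plain"),
                (b"content-length", str(len(body)).encode()),
            ],
        }
    )
    await send({"type": "http.response.body", "body": body})


async def call_app(asgi_app, scope):
    """Hypothetical stand-in for a server: call the app and collect what it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await asgi_app(scope, receive, send)
    return sent


messages = asyncio.run(call_app(app, {"type": "http", "method": "GET", "path": "/"}))
```

With a real server, the same app could be started with uvicorn main:app, assuming it is saved in main.py.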
An interesting thing about this is that servers behave differently regarding when they send the headers and the status code. For example, with Hypercorn, you first do all the processing of the request and send back the response, and only then does it send the status code and the headers to the client.
So it waits for the second message to send all the data. But Uvicorn, for example, already sends to the client when it receives the http.response.start. So what can happen is, for example, you send back the 200 while using StreamingResponse in FastAPI.
You send back the 200, and then you have an error. Well, if you have an error and there is an exception on the server, that's supposed to be a 500. But it's too late. The client already received the 200, so you see 200. But if you go to Sentry or Logfire, you also see an exception there, even though the status code was 200.
But this is the simplest ASGI application you're going to find. So this is the scope. As I said, there are three parameters for an ASGI application: the scope, and the receive and send callables.
This is what the scope looks like. It's just a bunch of information about the request and the connection. You can see the server and client IP there. And you can also see the HTTP version, the path, the parameters, everything the server receives that makes sense to pass to the application.
And we use some of that information to make things cute for the users of FastAPI. For example, for the headers, you just use the uppercase Header, and then you receive the data. In the first example, we just sent data from the application to the server with send.
And then the server sends that data back to the client. But you can also receive a body from the client.
The server receives that data, the request. And then it sends the application an event called http.request, which is used to read the body. What you see circled down there is what this event looks like.
So it's http.request, you see what the body looks like, and you see whether there is more body or not. That's more_body, and it's just there to check whether the client is still streaming data to the server. And yeah, this is how it happens.
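As a rough illustration of the http.request event and its more_body flag, here is a sketch of how an application could read a streamed request body. The read_body and fake_receive helpers are hypothetical names used only for this example; a real server provides the receive callable.

```python
import asyncio


async def read_body(receive):
    """Collect the request body by awaiting http.request events until more_body is False."""
    chunks = []
    while True:
        message = await receive()
        if message["type"] != "http.request":
            break  # for example http.disconnect
        chunks.append(message.get("body", b""))
        if not message.get("more_body", False):
            break
    return b"".join(chunks)


def fake_receive(events):
    """Hypothetical stand-in for the server side: replay a fixed list of events."""
    iterator = iter(events)

    async def receive():
        return next(iterator)

    return receive


events = [
    {"type": "http.request", "body": b'{"name": ', "more_body": True},
    {"type": "http.request", "body": b'"potato"}', "more_body": False},
]
body = asyncio.run(read_body(fake_receive(events)))
```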
You have the server and the client. The application processes the http.request and sends back an http.response.start and then an http.response.body. And now starts the part where I stop being boring and start talking more about how FastAPI really works, which is interesting and which you can use at work.
So you have the middleware stack. Again, you have the server and you have several middlewares, right? And then you have the application, so it runs server, middleware, and application.
And if you have multiple middlewares, you have multiple middlewares in the middle, and then you have the application. For example, in this code you have two middlewares, the custom middleware and another custom middleware. So what's going to happen is that you receive the data on the server and the server sends it to the application, but then
it passes through those middlewares. It passes through the custom middleware, then through another custom middleware, and then it goes to the application. So this is what it really looks like. By default, Starlette has two middlewares, the ServerErrorMiddleware and the ExceptionMiddleware.
The ExceptionMiddleware is used when you call add_exception_handler. You have a specific exception and you want to convert it into a specific response to the client. This is where that happens. So if you have an exception in your endpoint, it goes up and you handle it gracefully, like you send something useful to the client.
But if you have an exception that's not handled anywhere, then it's handled by the ServerErrorMiddleware. Some exception happened, you didn't know what it was, and you are able to see the traceback in your logs because of the ServerErrorMiddleware.
It also handles sending back the 500 to the client, because we want to send something to the client to make it aware that something happened. So this is how it happens. The server has this flow.
If something happens in the application, it goes back, which means that if you send back the response from the application, it also goes back through the middlewares. So you can see the data that's being sent at every point, and you can check, for example, the headers in one of them, or you can process something. And then after the middleware, you have the routing.
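The flow through the stack can be sketched with plain ASGI middlewares, without FastAPI or Starlette. The NamedMiddleware class and the names custom and another_custom are hypothetical and only mirror the two middlewares mentioned on the slide. Each middleware wraps the next app, and wraps send to observe the response on its way back.

```python
import asyncio

calls = []


class NamedMiddleware:
    """Pure ASGI middleware that records when the request and response pass through."""

    def __init__(self, app, name):
        self.app = app
        self.name = name

    async def __call__(self, scope, receive, send):
        calls.append(f"{self.name}: request in")

        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                calls.append(f"{self.name}: response out")
            await send(message)

        await self.app(scope, receive, send_wrapper)


async def endpoint_app(scope, receive, send):
    calls.append("app")
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


# The outermost middleware is the first one to see the request.
stack = NamedMiddleware(NamedMiddleware(endpoint_app, "another_custom"), "custom")


async def main():
    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        pass  # a real server would write to the socket here

    await stack({"type": "http", "method": "GET", "path": "/"}, receive, send)


asyncio.run(main())
```

The request enters through custom and then another_custom, and the response leaves in the opposite order.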
What happens here is that, again, you have the server, you have the middleware, and the application is called. When the application is called, internally it chooses which endpoint is going to run. One interesting thing here is that the choice of the endpoint, the route, comes after the middleware.
So if you need some information in the middleware about which endpoint is going to run, you actually don't have it. And you don't have it because the flow is like this. First you have the middleware, which is not aware of what comes after,
and then you have the routing, the choice of the endpoint that you want to run. This is actually one issue that we have in Starlette. People sometimes complain because in the middleware they want to have the data about which endpoint is going to run.
It's also helpful for observability. We received some complaints some time ago that it would be useful for observability tools to understand which endpoint is going to run. Usually, if they don't want to monkey patch, they use a middleware on their side to check which endpoint is going to run.
But anyway, this is the flow. I have one example here with two endpoints, on line 6 and line 11. One has a path parameter, name, and the other endpoint is /potato. Okay, if I call /potato, which endpoint is going to run?
Who thinks it's the first endpoint? Raise your hands. Okay, who thinks it's the second endpoint? Raise your hands. Okay. So actually, the endpoint that's called in this case is the first one.
The way it works is that the first match is the one that runs. This is a very common issue when you start coding with FastAPI. You're like, oh, why is this happening? Is this framework dumb?
This was not really a deliberate choice. It was decided a long time ago, and it's just how it is. The first match is the first that runs. Some people have been asking to get an error when that happens, when you are registering the routes and compiling the regex,
to check whether another route would already match. But yeah, there were some complaints, and there was no implementation. So this is something you need to be aware of. Sometimes when you're trying to reach an endpoint, you don't see any messages, because another endpoint is being reached, which is the case up here.
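The first-match behaviour can be modelled with a few lines of regex matching. This is a simplified sketch and not Starlette's actual router: compile_path, match, and the endpoint names read_name and read_potato are assumptions made for the example.

```python
import re


def compile_path(path):
    """Turn '/{name}' into a regex. Simplified stand-in for a real path compiler."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", path)
    return re.compile(f"^{pattern}$")


def match(routes, path):
    """Return the first route whose regex matches, in registration order."""
    for route_path, endpoint_name in routes:
        found = compile_path(route_path).match(path)
        if found:
            return endpoint_name, found.groupdict()
    return None


# Parameterised route registered first: it shadows /potato.
routes = [("/{name}", "read_name"), ("/potato", "read_potato")]
shadowed = match(routes, "/potato")

# Registering the fixed path first lets it win.
reordered = [("/potato", "read_potato"), ("/{name}", "read_name")]
fixed = match(reordered, "/potato")
```

In this model, as in FastAPI, declaring the fixed path before the parameterised one is the usual way to make sure the specific endpoint is reached.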
And yeah, so we talked about the middleware, and then there's the routing. And after that, we go to the endpoint itself. So here we have the read_root endpoint, and it has two parameters,
A and B, which are two dependencies. What happens is that it goes up to call the dependencies. So A calls dependency A, and B calls dependency B, and both of them call a dependency called dependency. So if I call this, then what happens?
If I call A, it's going to call dependency A, which is going to call dependency. And if I call B, it's going to call dependency B, which is going to call dependency as well. So who thinks the number of calls is going to be two?
Raise your hands. Okay, cool. But it's going to be one. You run this, A is called first, and the result of dependency is cached. Then B is called, and it asks for dependency again. But the result is already stored, so the value that was returned before
is the one we use. The dependency cache works within the request-response cycle, so the dependency runs once per request. On the next request, it doesn't use the old cache. It fills it again, and then you can have the same data. So this is what I just said. The app calls A, and then A calls dependency.
The result is stored in the cache, keyed by the hash of the function. And then you call B, and it checks the cache to see if the result is already there. It is, and so you get the result.
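A toy resolver can illustrate the per-request cache described here. It is a deliberately simplified model, not FastAPI's dependency solver: the DEPENDS table, solve, and handle_request are hypothetical names, and the cache is keyed on the function object itself.

```python
call_count = 0


def dependency():
    """Shared sub-dependency. Counts how many times it actually runs."""
    global call_count
    call_count += 1
    return "shared"


def dependency_a(dep):
    return f"a({dep})"


def dependency_b(dep):
    return f"b({dep})"


# Hypothetical dependency graph: which functions each function depends on.
DEPENDS = {dependency: [], dependency_a: [dependency], dependency_b: [dependency]}


def solve(func, cache):
    """Resolve func, reusing any result already computed for this request."""
    if func in cache:
        return cache[func]
    args = [solve(sub, cache) for sub in DEPENDS[func]]
    cache[func] = func(*args)
    return cache[func]


def handle_request():
    cache = {}  # a fresh cache for every request-response cycle
    return solve(dependency_a, cache), solve(dependency_b, cache)


first = handle_request()
count_after_first = call_count
second = handle_request()
count_after_second = call_count
```

Within one request the shared dependency runs once. A second request starts with an empty cache, so it runs again.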
Another point of confusion that's not written down anywhere is when you should use async or sync. Is Thomas here? Thomas? No? Okay.
Well, he teaches me a lot of stuff, Thomas Grainger. So I have these two dependencies. They differ only in the fact that one is async and the other is not. Who thinks I need to use the one on the left, with async? Raise your hands.
Okay. Who thinks I need to use the sync one? Okay. So what happens is that... By the way, the second group is wrong. The first group is right. So what happens is that if I run this with FastAPI
and I use the sync one, FastAPI will run it in a thread pool. This dependency is not doing anything expensive, it's doing basically nothing. So running it in a thread is more expensive, because of the thread overhead, than just using the first version.
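The difference can be made visible by checking which thread each kind of function runs on. This sketch uses asyncio.to_thread from the standard library as a stand-in for the thread pool; FastAPI itself goes through Starlette's run_in_threadpool, which is not used here. The function names are hypothetical.

```python
import asyncio
import threading


async def async_dependency():
    # Runs directly on the event loop thread.
    return threading.current_thread().name


def sync_dependency():
    # A framework has to hand this to a worker thread to avoid blocking the loop.
    return threading.current_thread().name


async def main():
    loop_thread = threading.current_thread().name
    on_loop = await async_dependency()
    # Stand-in for the thread pool dispatch that a sync dependency gets.
    in_pool = await asyncio.to_thread(sync_dependency)
    return loop_thread, on_loop, in_pool


loop_thread, on_loop, in_pool = asyncio.run(main())
```

For a dependency that only reads a header or does a small computation, the hop to another thread costs more than the work itself.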
This is one point where a lot of mistakes happen because, for example, you see a lot of dependencies that are just getting a header or computing some small thing, getting the value, doing some cleaning, and then returning. So for those cases that are not expensive,
that can run on the event loop, you should just use async. It will not use a thread, which is more efficient. So we talked about the middleware,
then choosing which endpoint is going to run, the routing, and then the dependencies. At some point I have all the dependencies, I get the data that's coming in from the client through the server, and I need to validate the data. That's what Pydantic does for us.
So, for example, I have the input, which is a BaseModel, and with that I'm validating that the body I receive has two fields. It's in JSON format and it has two fields, a name and an age, name being a string and age being an integer. So I'm just going to validate that that's really what's coming in, so that the types I declared really are what we have in the endpoint body
that we're going to use to do whatever we need, which in this case is inserting the data in the database. And then when it returns, in this case, I'm assuming the database insert returns the same data plus the ID.
So you see the Output class. I'm going to return that as well, and it's going to make sure that everything is okay. The difference is that on the input, if it's wrong, FastAPI will give the client a 422, unprocessable entity, and on the output, the client is going to receive a 500,
and you as a developer are going to see in the server logs that there was an issue, and then you're going to be able to fix it. So this is what happens. This is the wrap-up. You receive the data, you validate the data, you run the endpoint function, and then you validate what is returned and send back the data.
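The difference between the two failure modes can be sketched without Pydantic. The validate function below is a hand-rolled, stricter stand-in for what Pydantic does (it does not coerce types), and handle, good_insert, and bad_insert are hypothetical names. Only the status codes mirror what the talk describes.

```python
INPUT_FIELDS = {"name": str, "age": int}
OUTPUT_FIELDS = {"id": int, "name": str, "age": int}


def validate(data, fields):
    """Hand-rolled stand-in for a model: require each field with the given type."""
    bad = [name for name, typ in fields.items() if not isinstance(data.get(name), typ)]
    if bad:
        raise ValueError(f"invalid fields: {bad}")
    return {name: data[name] for name in fields}


def handle(body, insert):
    """Model of one request: validate input, run the endpoint, validate output."""
    try:
        item = validate(body, INPUT_FIELDS)
    except ValueError:
        return 422  # the client sent bad data
    try:
        validate(insert(item), OUTPUT_FIELDS)
    except ValueError:
        return 500  # the server produced bad data; details belong in the logs
    return 200


def good_insert(item):
    return {"id": 1, **item}


def bad_insert(item):
    return dict(item)  # forgets the id, so output validation fails


ok = handle({"name": "Marcelo", "age": 30}, good_insert)
bad_input = handle({"name": "Marcelo", "age": "thirty"}, good_insert)
bad_output = handle({"name": "Marcelo", "age": 30}, bad_insert)
```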
And I just want to briefly mention WebSockets. WebSockets in FastAPI are mainly just Starlette code, so you just put the WebSocket type on the websocket parameter
and then you need to accept the WebSocket handshake. That's why on line 8 we have websocket.accept(). So you tell the server that it needs to tell the client
that everything is good and we can start WebSocketing, and then you start exchanging data. So on lines 9 and 10 you iterate with iter_text. This is not the usual way, it's a newer way to write WebSockets.
That's actually mainly why I put it on this slide, because in the examples that you usually see online you have while True and then you use receive_text, but with iter_text it gets a bit cuter, so that's why I put it here. And then at the end of the exchange I just close the WebSocket.
And that's it. Don't forget to close the WebSocket, because if the client doesn't close it and you don't send this line 11, it will stay open until the instance is shut down.
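Underneath Starlette's WebSocket object, the exchange is again a sequence of ASGI messages. The sketch below shows the accept, exchange, and close steps at that level, using only the standard library. The run_websocket driver and the convention that the text "bye" ends the conversation are assumptions for this example.

```python
import asyncio


async def websocket_app(scope, receive, send):
    """Raw ASGI WebSocket echo app: accept, exchange text, then close."""
    assert scope["type"] == "websocket"
    message = await receive()
    assert message["type"] == "websocket.connect"
    await send({"type": "websocket.accept"})  # completes the handshake

    while True:
        message = await receive()
        if message["type"] == "websocket.disconnect":
            return  # the client closed the connection first
        text = message.get("text")
        if text == "bye":  # hypothetical convention for ending the exchange
            break
        await send({"type": "websocket.send", "text": f"echo: {text}"})

    await send({"type": "websocket.close", "code": 1000})


async def run_websocket(app, events):
    """Hypothetical stand-in for a server: replay events and collect sent messages."""
    iterator = iter(events)
    sent = []

    async def receive():
        return next(iterator)

    async def send(message):
        sent.append(message)

    await app({"type": "websocket", "path": "/ws"}, receive, send)
    return sent


events = [
    {"type": "websocket.connect"},
    {"type": "websocket.receive", "text": "hi"},
    {"type": "websocket.receive", "text": "bye"},
]
sent = asyncio.run(run_websocket(websocket_app, events))
```

In Starlette terms, the accept step corresponds to websocket.accept(), the loop to iterating over websocket.iter_text(), and the last message to websocket.close().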
I mean, if the client doesn't close it for you at some point. So this is what happens. The accept just does the WebSocket handshake, and then you can start exchanging data. So I talked about async and sync in the context of dependencies, but at the last conference, in Italy, I gave the same talk
and I said you should only use sync if you have blocking I/O or if it's a CPU-bound task, something like that. I said that, and that's true. But there is a threshold for where it's worth using one or the other.
So it might be that you think you should use sync in those two cases, but when you actually run it, it turns out to be faster to use async than not to use it. There is an environment variable called PYTHONASYNCIODEBUG
that you can use to check whether the code is too slow on the event loop. If it is, then at that point you can remove the async and use only the sync version, and FastAPI will use the thread pool.
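Setting PYTHONASYNCIODEBUG=1 turns on asyncio's debug mode, which among other things logs a warning when a single step blocks the event loop for longer than the loop's slow_callback_duration (0.1 seconds by default). The same mode can be enabled in code with asyncio.run(..., debug=True), which this sketch uses so it can run on its own. The ListHandler class and blocking_endpoint are hypothetical names for the example.

```python
import asyncio
import logging
import time

records = []


class ListHandler(logging.Handler):
    """Collect asyncio's log messages so they can be inspected afterwards."""

    def emit(self, record):
        records.append(record.getMessage())


logging.getLogger("asyncio").addHandler(ListHandler())


async def blocking_endpoint():
    # time.sleep blocks the event loop; nothing else can run for 0.2 seconds.
    time.sleep(0.2)
    return "done"


# debug=True has the same effect as running with PYTHONASYNCIODEBUG=1.
result = asyncio.run(blocking_endpoint(), debug=True)
slow_warnings = [message for message in records if "took" in message]
```

If warnings like these show up for an async endpoint or dependency, that is a hint that it may be a candidate for the sync version, which runs in the thread pool.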
I have created some tips in this repository. You just Google FastAPI tips and you'll find it on GitHub. And we have a booth. I am working at Pydantic. We've created a product called Logfire. It's an observability tool. It looks cute. We are there at the booth, and Samuel, the creator of Pydantic,
is also there, so you can talk to him. And thank you for being here. Thank you very much for the talk. We have a few minutes of Q&A.
If you have a question, you can step up to one of the microphones, and while we wait for people to step up, there is one of the online questions. It was about the slide where you showed the matching of name and potato. And the comment is: wouldn't a simple solution to this routing problem be to route by how specific the paths are?
So let's have the string potato match first and then the brackets name, because it's not as specific as a string. You mean changing the order? When you had the order of matching so that the endpoint would match with the brackets,
if you sent the word potato and you had a potato later, that would not match. If we can't solve this within seconds, then we'll take questions from the room. I can answer that later. Hello. I also have a question about the routing.
For the custom routers that you can add to FastAPI, if you have a router with a prefix, is the prefix itself resolved before resolving the URL? So for example, if I have two routers with the same prefix, will only views from the first router be accessible and not from the second,
or should all views be accessible if they don't have colliding URLs down the line? You mean, for example, if I had potato and something else there? If I have two separate registered routers, and inside of them I registered views with non-matching URLs,
but the routers themselves have the same prefix. So for example, the same prefix can be name here and potato, right? Or you mean potato and potato?
I don't mean in the view itself. No, I understand it's in the router, but you mean, for example, if I have name and then potato, that will match, and I have something else after. Or if they are even exactly the same. I don't know, you have a router with V1 and the other is also V1, because you registered it somewhere else in your code.
And then? Will it match only the first router, or will it progress through the views? No, no, it's going to find everything. It's going to find everything. The thing is, if you have something else after the prefix, it's going to find everything. If at the end you just have those two, then it's just going to match the first one.
Okay, thank you. Thank you for the question. The microphone here is empty, so please ask the next question from there. Hey, thanks for the talk, that was interesting. Could you pull up the slide where you did the database insert? Yeah, this one. So I find I often have a similar pattern where I need to, say, inject some database controller
or some other instance into the route so that the route can, say, insert the data. One thing I'd like to be able to do is inject the instance of the class into the route and have the framework handle doing that injection.
But I haven't found a good way to inject single instances of custom classes. So I use a kind of workaround where, say, in the lifespan I create my custom classes
and store them in the state of the app, and then I create another function where I specify as a dependency that I want to take this instance from my app state, and then I can use the Depends kind of magic to get FastAPI to inject it.
I was wondering if there's something cleaner for doing that. Adrian, the other maintainer of Starlette, created this package called Xpresso, and he does something like this. Is it more or less what you want?
He doesn't use the lifespan, he uses something that he called binding, I think. Is it somewhere here? Yeah, it's binding. He binds it, and it gets injected into the endpoints, which is cute. Okay, yeah. There was another package called dependency injection that someone wrote that did something similar.
But the maintainer of that has kind of gone AWOL. So I'll check this out. This one is by the maintainer of Starlette. Okay, great. Thanks. Thanks for the question. That's all the time we have now. The only thing left is to have another great round of applause for our speaker. Thank you. Thank you.