Stuck in the Middle: Leverage the power of Rack Middleware
Formal Metadata
Part Number | 17
Number of Parts | 89
License | CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/31563 (DOI)
RailsConf 2016, 17 / 89
Transcript: English (auto-generated)
00:12
Thank you for coming to this talk. My name is Amy Unger, and I'm here to talk, I'm here to learn that my slides are becoming sentient,
00:23
but also, I'm here to talk about a part of the Rails ecosystem that I ignored for quite some time, and I came to regret that. It's easy as a new Rails developer to ignore Rack Middleware. I mean, it's got words like rack and middleware,
00:40
and those words are scary, especially when you're still getting to the point where you're comfortable with the concepts of models, views, and controllers. And then, as I became a more experienced developer, I ran into gems that relied on middleware, and even small pieces of middleware in the code bases I was working in.
01:03
And it's really easy to just pick out the parts of the middleware that seem relevant to the changes you need to make, or the bug you're trying to track down. And so, it's pretty easy to think that there's not much more to Rack Middleware, because to be honest, there isn't.
01:21
It's designed to be a simple, but powerful interface. But I never really took the time to understand what was going on. So at the beginning of my career, Rack Middleware seemed far too advanced for me, and then it jumped straight to being too boring and too obvious.
01:41
Now, I'm gonna let you guys in on a secret. I wrote some pretty bad middleware because of that. Probably the obvious conclusion, but I wrote middleware that wasn't thread-safe. I didn't push back when middleware that should never have been middleware was written. And I maintained some sprawling middleware
02:02
that is pretty much unintelligible because it is so sprawling. And finally, I didn't write middleware when I should have. I didn't know that it was a tool that I could use. So today, I want to address some things that I would have loved to know and some of the mistakes that I made,
02:22
so you don't have to make them. So first, I'm gonna talk about what Rack Middleware is. I'm gonna go through some examples of how we build Rack Middleware. I'm gonna cover why you might want to use Rack Middleware as a tool. And because I got on a theme with my section headings,
02:41
this last one is called "who", as in who made this mistake, who wrote this. (I think it's when I step in one spot that the slide moves, okay.) So "who" is about the things that will make your successors try to find you
03:01
and track you down and tell you why you shouldn't have done that. So let's start with what. I'm gonna jump into what Rack and Rack Middleware are. Look at how they fit into Rails, and then take a brief look at some familiar examples of Rack Middleware. So what's this Rack thing?
03:24
So let's go to a world where Rack doesn't exist. And you need a server, for instance, for this example, I'm just going to use a CGI server. So your user is going to make an HTTP request here through a browser. It's gonna hit your CGI server.
03:43
Now your CGI server is going to take that request and parse it into different parts of data that it's going to shove onto the environment. So it sets about 20 or 30 environment variables.
04:03
Some of these should be somewhat familiar: PATH_INFO, HTTP_ACCEPT for your headers, all this sort of stuff it's shoving into the environment. And then it will run your code. Now your application code will then write to standard out,
04:22
which the CGI server will pick up on, formulate that into an HTTP response, and send it back to your user. Now what that means is that your application needs to know that it's being run by a CGI server.
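To make the CGI flow concrete, here is a minimal sketch of an application written directly against that interface. The method name `handle_cgi_request` and the page content are assumptions for illustration; the point is only that the code reads request data from environment variables and prints the response.

```ruby
require "stringio"

# Hypothetical CGI-style program. A CGI server would set variables such as
# PATH_INFO in the process environment, run this code, and turn whatever it
# prints into the HTTP response.
def handle_cgi_request(env = ENV, out = $stdout)
  path = env.fetch("PATH_INFO", "/")
  # CGI output is header lines, a blank line, then the body.
  out.print "Content-Type: text/html\r\n\r\n"
  out.print "<p>You asked for #{path}</p>\n"
end

# Simulate what the server would do, using an in-memory buffer for stdout.
buffer = StringIO.new
handle_cgi_request({ "PATH_INFO" => "/index.html" }, buffer)
```

Everything here is tied to how a CGI server passes data in and reads data out, which is the coupling the talk goes on to describe.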
04:42
It needs to know that it can pull out those environment variables to figure out, hey, someone's asking for index.html. And that's a bit of a problem for many of us, because we tend to develop with one server,
05:01
and then deploy to production with another. So you're using multiple servers in the course of a day, and your application has to figure out what's running it. So let's move to a world with Rack. So when we shove Rack in the middle of your server and your application, the situation looks a little bit different.
05:21
Your user makes an HTTP request to your server, here it's WEBrick, it can be anything. That server knows that it should talk to Rack. Rack parses the information it gets from the server into a standard incoming request that is the same for any different server.
05:44
And that means that no matter the server that is running your application, your application can write the same logic. And then coming back up the stack, it's the same thing. Your application just returns the response
06:01
in a Rack-compliant way, and Rack figures out the details of how it should then talk to the server. So what does Rack look like to your application? To your application, it looks like there's an incoming request with the environment hash. We can see that Rack took inspiration from CGI
06:28
because it essentially took those environment variables and wrapped them in a hash and called that the environment, but it's no longer setting those variables on the environment, it's passing that into your code.
06:41
The outgoing response that your app is going to return includes the status code, the headers, and the content body. So let's take a look at the simplest Rack app we can create. Rack apps need to follow three rules.
07:01
We need something we can call, whether that's a class or an instance method, or the proc we have here. That thing needs to accept the environment, which is going to be that hash that includes the path info, headers, all that information. And then it needs to return an array with the following.
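The slide itself is not part of the transcript. As a sketch of an app that follows these three rules (the body text and the sample environment are assumptions), it might look like this:

```ruby
# A minimal Rack-style app: something callable that takes the environment
# hash and returns [status, headers, body].
app = proc do |env|
  [200, { "content-type" => "text/html" }, ["<h1>Hello from #{env["PATH_INFO"]}</h1>"]]
end

# Calling it by hand with a tiny, made-up environment hash.
status, headers, body = app.call({ "PATH_INFO" => "/index.html" })
```

In a `config.ru` file, an object like this would be handed to Rack with `run app`.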
07:20
So here we can see the status, which is 200, a hash of headers, here we're just saying we're returning HTML, and an array of the content body. So those are the three rules that you need to follow to be a Rack app. So what is Rack middleware?
07:43
So if we look at this diagram and then zoom in on Rack, we can see, well, we're going to look at the three parts of working with Rack. Now Rack took their logo inspiration from a server Rack,
08:01
but I'm gonna use that to display the three different parts. So the first part is the handler for your server, WEBrick, Mongrel, CGI, Puma, et cetera. There's a handler for it. The next part is the adapter that's at the bottom
08:23
that will talk to your framework. And so far we have a pretty simple setup. The request is just gonna flow through Rack and get transformed for the server and the framework. What if we want to do things to that request
08:43
before it hits the server, coming back up as the response or hits your application coming in? Well, that's where middlewares come in. So middlewares allow you to work with the request or the response before it exits, either at the bottom of the stack or the top of the Rack stack.
09:03
And just to show some of the power of what Rack middleware can do, here are some examples of middleware that Rails provides to you. So first off, Rails uses middleware to serve up static files. It uses middleware to set up logging for each request
09:22
and then flush all logs at the end of the request. Uses middleware to set up a cookie for the request, to handle flash messages, and to parse out params. So the params that you're used to in controllers, that's done in middleware. In addition to middleware in the Rails core code base,
09:46
within the Ruby web app ecosystem, there are other notable gems that use middleware as a strategy. They're used for throttling, for security with honeypots, and for authentication with Warden.
10:03
So let's take a quick look at how we're going to build our middleware. So first we're going to write a very basic middleware. This is just going to be a ping setup. We're just going to be able to ask our application, hey, are you up and running?
10:23
So first we'll create a file at lib/middleware/ping.rb. We'll create a class called Ping. And to make sure that there won't be any namespace clashes, we'll throw that in a module.
10:41
So now we have this class, Middleware::Ping. We're going to write an initialize method. Now every piece of Rack middleware needs to accept, when it's initialized, it needs to accept the app. Now the app could be your Rails app, but it could also be another piece of middleware.
11:02
You can imagine middleware as a set of Russian nesting dolls. Each one calls down to the smaller one until it hits your app. So really what's happening here is our middleware is going to be initialized with the next middleware down the stack. If it's the very last middleware to be called,
11:22
it will be initialized with our Rails app. So a middleware has to follow the same three rules as an app. We need something that responds to call, it must take the environment, and it must return the status, headers, and content body.
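The nesting-doll picture can be sketched with a small tracing middleware. The class name `Tracer` and its extra constructor arguments are hypothetical; only the `initialize(app)` and `call(env)` shape comes from the talk.

```ruby
# Hypothetical middleware that records when the request passes through it
# on the way down and when the response passes back up.
class Tracer
  def initialize(app, label, trace)
    @app = app      # the next doll in: another middleware or the app itself
    @label = label
    @trace = trace
  end

  def call(env)
    @trace << "#{@label} in"
    response = @app.call(env)   # call down the stack
    @trace << "#{@label} out"
    response                    # hand [status, headers, body] back up
  end
end

trace = []
app = proc { |_env| trace << "app"; [200, { "content-type" => "text/plain" }, ["ok"]] }
stack = Tracer.new(Tracer.new(app, "inner", trace), "outer", trace)
status, _headers, _body = stack.call({})
# trace is now ["outer in", "inner in", "app", "inner out", "outer out"]
```

Each middleware sees the request on the way in and the response on the way out, in mirrored order.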
11:41
So our call method here is going to look very much like a Rails app or any Rack compliant app, except for the fact that it will need to call down the stack. So let's write the simplest thing that is Rack compliant. So you can see our call method here
12:01
is accepting the env, and it's returning a Rack compliant response. This is really cool. This is going to work. When we hit our app, we will receive a response of Pong. Unfortunately, we're gonna see that in every single case. This will never call down the stack. Every request to your app will respond with 200 Pong.
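A sketch of this first, deliberately incomplete version, reconstructed from the description (the exact headers on the slide are not in the transcript, so the content type is an assumption):

```ruby
module Middleware
  class Ping
    def initialize(app)
      @app = app  # next middleware down the stack, or the app itself
    end

    # Rack compliant, but it never calls down the stack, so every
    # request gets the same answer.
    def call(env)
      [200, { "content-type" => "text/plain" }, ["Pong"]]
    end
  end
end

# The inner app is never reached, whatever path is requested.
inner = proc { |_env| raise "the app should not be called" }
status, _headers, body = Middleware::Ping.new(inner).call({ "PATH_INFO" => "/anything" })
```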
12:25
Probably not what we want to do. So let's fix that. So first, let's take a look at what that request is. We're going to take that env, parse it out into something that's a little bit better to work with.
12:42
Request is going to have some of these environment variables set. And we can see here that we can look to see whether it's a GET, a POST, a PUT. We can see that path info. And so with that information,
13:02
we're going to be able to finish this method. So in this first line, we parse out the env into the request object. We then check to see if the request path is the route that we want to match on.
13:20
If it's not, we're just gonna call down the stack. We're not gonna do anything. We're just gonna pass everything down. And we're going to immediately return whatever the call down the stack responds with. If the user did request that ping route, we're gonna return 200 Pong.
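Put together, the finished ping middleware might look like the sketch below. The talk wraps the environment in a request object (with the rack gem that would be `Rack::Request.new(env)`); to stay free of gem dependencies this sketch reads `PATH_INFO` from the hash directly, and the `/ping` path is an assumption.

```ruby
module Middleware
  class Ping
    PING_PATH = "/ping" # assumed route; the talk does not spell it out

    def initialize(app)
      @app = app
    end

    def call(env)
      if env["PATH_INFO"] == PING_PATH
        # Answer right here without touching the rest of the stack.
        [200, { "content-type" => "text/plain" }, ["Pong"]]
      else
        # Not our route: pass everything down and return what comes back.
        @app.call(env)
      end
    end
  end
end

app = proc { |_env| [200, { "content-type" => "text/plain" }, ["from the app"]] }
stack = Middleware::Ping.new(app)
ping_response  = stack.call({ "PATH_INFO" => "/ping" })
other_response = stack.call({ "PATH_INFO" => "/index.html" })
```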
13:43
All right, so let's look at a less basic middleware. Request response time logging. Now this is actually one of the most common middlewares I've seen written for small pieces of middleware. It's amazing how often you do not have beautiful things like New Relic.
14:01
And instead you're on a solution where you need to do a little bit of your own logging and monitoring. So what we're gonna be doing is we're gonna be tracking how much time it takes your app to complete a request to index.html, or to any route. So we'll start with the same pattern we saw for ping.
14:21
We'll take a new file, lib/middleware/request_time_logging.rb, we'll make a new class, again, namespaced. We'll create this initialize method, which again has a reference to the app that it needs to call down the stack.
14:44
And now we get to the meat of this middleware, the call method. So it's going to take the env as we know it needs to, and it's going to call down the stack. Because we want this piece of middleware to always call down the stack, it's not intercepting and returning any requests,
15:01
it's not dropping any requests on the floor, it's always going to call down. We know we need to do this. We know that we need to call down the stack. So let's get some timing in here. We can start with recording the start time of the call, and then we compute the elapsed time.
15:22
Now this is nice, we're calling down the stack, and we know how long it takes. But the problem is, we never actually return a Rack compliant response. As you can see here, we're returning just the number, the elapsed time in seconds. We probably actually want to return a Rack compliant response.
15:41
So the way we do that, we know we want to return the status headers and response. We know that we'll get that data from calling down the stack. And so we just save that data off as we get that response coming back up the stack
16:03
into our middleware, and then we return it. So now our app is working fine. Our users are hitting various endpoints in our app, and they're getting responses. But we still haven't logged our elapsed time anywhere.
16:21
We're computing it, sure, but it's not being sent anywhere. So let's get working on that. We're going to do that in another method. We'll call that log response time, and that method will need to know how long it's taken to call down the stack.
16:42
And because it's important for us to know what the path is (it's not really helpful for me to know that it took 54 seconds for this request to complete if I don't know what route the user was hitting), we'll also send in the request. Now let's get to writing that method.
17:01
We want to make sure that it's a private method. There's really nothing on a middleware besides the initialize method and the call method that needs to be public. So it's a private method. It takes the elapsed time and the request, and we'll set up a JSON payload
17:23
with that data to be sent to our logging. And because the most common reason to do this is that you're using something like Splunk, here's just an arbitrary implementation of sending it to Splunk, with your instrumentation name of request.responseTime, and sending the payload off.
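The steps above can be sketched as one class. Two things here are assumptions rather than the speaker's code: the payload is written to an IO object (`out:`) instead of being sent to Splunk, and the private method receives the raw `env` hash rather than a request object.

```ruby
require "json"
require "stringio"

module Middleware
  class RequestTimeLogging
    # `out:` is a stand-in for a real log shipper.
    def initialize(app, out: $stdout)
      @app = app
      @out = out
    end

    def call(env)
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      status, headers, body = @app.call(env)   # always call down the stack
      elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      log_response_time(elapsed, env)
      [status, headers, body]                  # return the response unchanged
    end

    private

    def log_response_time(elapsed, env)
      payload = {
        path: env["PATH_INFO"],
        method: env["REQUEST_METHOD"],
        elapsed_seconds: elapsed.round(6)
      }
      @out.puts "request.responseTime #{JSON.generate(payload)}"
    end
  end
end

log = StringIO.new
app = proc { |_env| [200, { "content-type" => "text/plain" }, ["ok"]] }
stack = Middleware::RequestTimeLogging.new(app, out: log)
response = stack.call({ "PATH_INFO" => "/index.html", "REQUEST_METHOD" => "GET" })
```

A monotonic clock is used for the timing so that system clock adjustments cannot produce negative or inflated durations.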
17:43
And that middleware is now complete. I want to quickly review an example of middleware in a gem, because it's often that you're going to be debugging something, and it's good to just take a quick look
18:02
at how you would deal with a larger code base than something you might write quickly for your own app. So the example here is for throttling. Middleware is a great option for throttling because it can drop requests on the floor, or return them, hopefully in this case, we're going to be returning with an unauthorized
18:25
response without actually putting any load on our application. Our application never actually sees this. Our server doesn't have to deal with the load of bringing that response all the way down to the application. We can return as soon as possible.
18:42
So rack-throttle, here on GitHub. One of the interesting things you'll see when trying to debug middleware gems is that you need to find the core middleware class, the class that has the initialize method and the call method, and they're always named differently.
19:01
For this one, it's the Limiter class. But once you find it, you'll see that there's an initialize method that does take app. It also takes options, which is an incredibly useful thing for allowing people to configure your code
19:20
if you are packaging up a middleware as a gem. But first, the important thing is the app. And then, a call method. This call method, as we know, takes the environment. And then on this second line of the call method,
19:41
there's some pretty short logic. It asks, is this request allowed? If it is, call down the stack and return the result of calling down the stack. If it's not, call this method, rate limit exceeded, and presumably we're going to return an unauthorized response code with some helpful message.
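The allow-or-reject shape of that call method can be sketched as follows. This is not the gem's code: `SimpleLimiter`, the `:max` option, the per-address counter that is never reset, and the 429 status are all assumptions made for illustration.

```ruby
module Middleware
  class SimpleLimiter
    def initialize(app, options = {})
      @app = app
      @max = options.fetch(:max, 60)  # hypothetical option: requests allowed per client
      @counts = Hash.new(0)
      @lock = Mutex.new               # the counter is shared across requests
    end

    def call(env)
      allowed?(env) ? @app.call(env) : rate_limit_exceeded
    end

    private

    # Counts requests per client address. A real throttler would also
    # expire these counts after a time window.
    def allowed?(env)
      client = env["REMOTE_ADDR"]
      @lock.synchronize { (@counts[client] += 1) <= @max }
    end

    def rate_limit_exceeded
      [429, { "content-type" => "text/plain" }, ["Rate limit exceeded"]]
    end
  end
end

app = proc { |_env| [200, { "content-type" => "text/plain" }, ["ok"]] }
stack = Middleware::SimpleLimiter.new(app, max: 2)
env = { "REMOTE_ADDR" => "203.0.113.7" }
statuses = 3.times.map { stack.call(env).first }
```

The rejected request never reaches the app, which is the load-saving property the talk points out.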
20:05
And I wanna show this to you because it shows how little rack middleware code you really need to write before you can really jump in to writing code that is specific to your application, to what you want to do.
20:21
So you really don't need to understand too much of rack middleware here to get going with writing a throttler. Now, all you have to know is how to throttle, which is a totally different topic, one that would be very interesting but not this talk. So why would we want to write pieces of rack middleware?
20:43
What is this tool best used for? Middleware can simplify your application. Middleware is very good at dealing with requests that your application should never see. So a good example of this is a situation at my job
21:00
at Heroku where we used to support a website called addons.heroku.com. This would be a site where you could see all of our Heroku add-on offerings as essentially a catalog of products. Now behind this application also lives an admin interface
21:21
for managing those add-ons. But the product data, essentially the store, has moved to elements.heroku.com. Now, what this means is that we have a lot of people out there who still want to go
21:41
to addons.heroku.com slash New Relic and see data about New Relic. And those routes are hitting addons.heroku.com. But addons.heroku.com currently has no idea what a front-facing add-on is. All it knows is how to administer it. So our route file for addons.heroku.com
22:03
ends up being hundreds of lines long because it needs to redirect users to our new marketing site, even though addons really shouldn't know anything about selling add-ons. It just needs to know how to administer them.
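Redirects like these can be answered by a small middleware instead of the routes file. In this sketch the class name, the path mapping, and the `https` scheme are hypothetical; only the elements.heroku.com host comes from the talk.

```ruby
module Middleware
  class LegacyRedirects
    TARGET_HOST = "https://elements.heroku.com" # host named in the talk; scheme assumed
    # Hypothetical mapping of old catalog paths to their new locations.
    REDIRECTS = { "/newrelic" => "/addons/newrelic" }.freeze

    def initialize(app)
      @app = app
    end

    def call(env)
      new_path = REDIRECTS[env["PATH_INFO"]]
      return @app.call(env) unless new_path  # not a legacy path: call down

      [301, { "location" => "#{TARGET_HOST}#{new_path}", "content-type" => "text/plain" }, ["Moved"]]
    end
  end
end

app = proc { |_env| [200, { "content-type" => "text/plain" }, ["admin interface"]] }
stack = Middleware::LegacyRedirects.new(app)
redirect = stack.call({ "PATH_INFO" => "/newrelic" })
admin    = stack.call({ "PATH_INFO" => "/admin" })
```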
22:22
So the solution here is to pull out some of those routes from that route file, move them into middleware, so that when you hit addons.heroku.com slash New Relic, you don't even need to hit the addons.heroku.com route file. As a developer, I can deal with a very simple route file
22:40
and you guys get a response far faster which says, hey, redirect to this new site. Moving on, middleware can protect your application. So continuing on the theme of handling requests before they get to your application, there are certain types of requests
23:01
that you just don't want your application to deal with. If you know that a particular request is malicious, there's no need for it to hit your app. Your middleware can handle it. And along this line, the throttling example that we saw, if your app shouldn't ever receive that request,
23:21
your throttler can stop it. Also similar is implementing honeypots. Middleware sees both the request and the response object. So when a request comes into your application, it comes in through the routes file and it can hit any number of controllers
23:40
and it often will exit, again, through any one of those controllers. The only singular piece of code, and in fact singular method, that you have easy access to, that sees both that request object coming in and the response coming back up,
24:00
is a piece of middleware. And so this is how we know we want to write that request response timer in middleware because that one method can be confident that it will see any request coming in as well as the response coming back up.
24:25
Middleware can be a code sharing mechanism. So this may not be so interesting to you if you are writing middleware for your own personal apps, but if you do have a piece of code that you want to share as a gem, middleware can be an important tool
24:43
for sharing that code and allowing users to easily just drop it in without needing to do much additional configuration. So now, "who": the things that will trigger
25:02
your successors to git blame your code. First I want to talk about order. Order in Rack Middleware is important. Going back to the Russian nesting doll analogy, it wouldn't make sense to try to fit
25:20
the largest Russian nesting doll into the smallest one. It's not going to happen. Similarly, with Rack Middleware, there's a necessary order. So Rails provides a really nice rake task to tell you which middleware are being configured
25:41
to be used with your application right now and the order that they're in. So if you run rake middleware on any modern Rails app, you'll see this output, and this is from top to bottom, the order in which your Rack Middleware will run
26:02
as the request is coming in, and then in reverse order as the response comes back up. So let's look at some examples of how order can be a difficult thing
26:22
and cause bugs for you. So taking the example of returning immediately for static file requests, we can see at the top of the rake middleware output that Rails is going to immediately respond
26:43
with static files. We can also see down below that Rails is setting up the request ID and logging for the individual request. Now what this means is that if you're looking in your logging for all the requests for static files,
27:03
you're not necessarily going to see them. In fact, with this configuration, you certainly won't. So unless you have alternate monitoring, you're not going to know that you're being DDoSed by someone asking for logo.png.
27:24
Now if you wanted to make sure that all of those static files do get logged, you can just change the order in which you load your Rack Middleware so that you are logging those requests.
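The effect of ordering can be reproduced outside Rails with two stand-in classes. `StaticFiles`, `RequestLog`, and `build_stack` are hypothetical names for this sketch and are not the Rails middleware themselves.

```ruby
# Stand-in for a static-file middleware: answers immediately for the one
# file it knows, without calling down the stack.
class StaticFiles
  def initialize(app)
    @app = app
  end

  def call(env)
    return [200, { "content-type" => "image/png" }, ["(png bytes)"]] if env["PATH_INFO"] == "/logo.png"
    @app.call(env)
  end
end

# Stand-in for a logging middleware: remembers every path it sees.
class RequestLog
  def self.seen
    @seen ||= []
  end

  def initialize(app)
    @app = app
  end

  def call(env)
    self.class.seen << env["PATH_INFO"]
    @app.call(env)
  end
end

# The first class in the list becomes the outermost middleware, matching
# the top-to-bottom order described in the talk.
def build_stack(middleware_classes, app)
  middleware_classes.reverse.inject(app) { |inner, klass| klass.new(inner) }
end

app = proc { |_env| [404, { "content-type" => "text/plain" }, ["not found"]] }

build_stack([StaticFiles, RequestLog], app).call({ "PATH_INFO" => "/logo.png" })
logged_when_static_first = RequestLog.seen.dup   # nothing was logged

RequestLog.seen.clear
build_stack([RequestLog, StaticFiles], app).call({ "PATH_INFO" => "/logo.png" })
logged_when_log_first = RequestLog.seen.dup      # the request was logged
```

In a Rails app the equivalent repositioning is done in configuration, for example with `config.middleware.insert_before` or `config.middleware.insert_after`.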
27:45
So another example, configuring warden. So we're really excited in this example. Let's say we're really excited to add warden. We add it right at the top. We're super excited.
28:01
And we can see lower at the bottom, we're actually setting up that session. Now this is a problem if we read the warden documentation because warden must actually be downstream. It depends on having session variables set. And so for warden, we need to make sure
28:24
that when we first start using warden, warden Rack Middleware gets installed and gets used at the bottom of the stack. Next, application logic. So earlier I talked about how Rack Middleware
28:41
can help you simplify your application by pulling irrelevant parts out of the app. But there's nothing in theory stopping you from obscuring your entire application in Middleware. This could be your entire app for an incredibly complicated application where everything is just handled through Middleware.
29:02
That's actually something that's kind of interesting to think about if you wanted to compose your application of small services. But for most applications, you'll probably actually want a full application that handles most of your business logic. You'll want one place where people can look
29:20
when they want to debug things. You don't want to be dealing with Rack Middleware and dealing with business logic in Rack Middleware as well as in your application. So some red flags for maybe you should consider moving this logic out of Rack Middleware
29:41
and into your application. First of all, if you're modifying the request. Now there are plenty of things that are going to add to the request. Many things that are going to help set up your application to handle the request. But if you're modifying or overwriting things like post data, things like the request path,
30:05
you're probably on the wrong path. Next, awareness of business logic. It's hard to search for bugs across multiple places, so you probably want to keep your business logic
30:21
in the place that people expect it to be. It's also not necessarily the easiest to test business logic in Middleware because you are mostly working with either unit tests for that Middleware or integration testing, acceptance testing. So you're probably going to end up with more bugs
30:42
if you split your business logic between your app and your Middleware. Another thing that's very similar is if it has awareness of the models, if it has awareness of the data structure, you're probably on the same path
31:01
of something that's not going to be as maintainable. I don't know your app. Maybe you're doing these things and maybe they work for you. So what are some mitigation strategies if you're going to be adding in application logic to your Rack Middleware? So the first suggestion I have is to use app/middlewares.
31:24
So if you are going to have that app logic, make it very explicit that that's what you're doing. It makes it easier for someone who's coming in who's debugging something to search for some keyword in app and find that code.
31:40
It also makes it clear to a new developer coming on that yes, the Middlewares are something I need to learn. They're not extra libraries that I can reduce my cognitive load by learning later when I might need them. If it's particularly hackish, use app/hacks or lib/hacks to add that extra red flag
32:02
of hey, we know we're not super comfortable with this, but it's still something that we want to do. And then finally, use keyword comments. Make it easy for someone who's searching for this bug three years after you've left
32:20
to find the code by throwing into comments some keywords you think they might be searching for. Always return a response. So be aware that you are in a stack of Middleware and that you need to comply with the three rules
32:41
of Rack Middleware. Exceptions are not part of that. If your Middleware is throwing an exception that it never catches, that's probably not a good thing. What that's going to result in is a 500 error that's not pretty.
33:01
You're not gonna get the Rails, you know, oh, I'm sorry, something's wrong and we're on it page, because your Ruby code has errored and Rack is gonna try to do its best, but it's gonna be ugly.
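One way to follow that advice is to rescue failures in the middleware's own work so the response still gets returned. A minimal sketch, assuming a hypothetical `reporter` callable that may fail (for example a network call to a metrics service):

```ruby
module Middleware
  class SafeInstrumentation
    def initialize(app, reporter)
      @app = app
      @reporter = reporter
    end

    def call(env)
      response = @app.call(env)
      begin
        @reporter.call(env["PATH_INFO"], response[0])
      rescue StandardError => e
        # The middleware's own failure must not turn into an ugly 500.
        warn "instrumentation failed: #{e.message}"
      end
      response  # always hand a Rack compliant response back up
    end
  end
end

app = proc { |_env| [200, { "content-type" => "text/plain" }, ["ok"]] }
broken_reporter = proc { |_path, _status| raise "metrics service unreachable" }
response = Middleware::SafeInstrumentation.new(app, broken_reporter).call({ "PATH_INFO" => "/" })
```

Only the middleware's side work is wrapped here; errors raised by the app further down the stack are left alone so the framework's own error handling still applies.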
33:20
So final thing, thread safety. Thread safety in Rack Middleware is only important if you are setting instance variables. Now that usually implies a far more complicated piece of Middleware than what I've written with you guys today.
33:43
So we'll make ping thread safe, even though it doesn't need to be thread safe, because we're not updating app, please don't update app on the fly. Because we're not updating app, we don't need to make this thread safe, but as an example, let's do it.
34:02
So we're gonna take that call method, all we have to do is dup the instance of Middleware. We then move all the logic that we put into the call method into another private method,
34:21
the convention is _call, and then we're done. So we reviewed today what Rack and Rack Middleware are, some of the reasons why you would want to use a piece of Rack Middleware in your application,
34:43
a little bit about how to make sure you're doing smart things with Rack Middleware, and generally I hope this talk has made you feel more excited about using Rack Middleware sometime down the line. So thank you.
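The thread-safety step from the end of the talk (dup the instance, then move the logic into `_call`) can be sketched as below. The `@env` instance variable is added only to show per-request state, and `_call` is declared `protected` rather than `private` because Ruby does not allow a private method to be called on another object such as the copy.

```ruby
module Middleware
  class Ping
    def initialize(app)
      @app = app
    end

    # Each request runs on a shallow copy, so instance variables set while
    # handling it never leak into other requests sharing this middleware.
    def call(env)
      dup._call(env)
    end

    protected

    def _call(env)
      @env = env  # per-request state lives on the copy only
      if @env["PATH_INFO"] == "/ping"  # assumed route
        [200, { "content-type" => "text/plain" }, ["Pong"]]
      else
        @app.call(@env)
      end
    end
  end
end

app = proc { |_env| [200, { "content-type" => "text/plain" }, ["from the app"]] }
ping = Middleware::Ping.new(app)
response = ping.call({ "PATH_INFO" => "/ping" })
state_leaked = ping.instance_variable_defined?(:@env)
```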