Site Availability is for Everybody
Formal Metadata
Part Number: 35
Number of Parts: 89
License: CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers: 10.5446/31557 (DOI)
RailsConf 2016, Part 35 of 89
Transcript: English (auto-generated)
00:03
So, my name is Stella Cotton, I also maybe destroyed my voice a little bit last night
00:21
at Ruby Karaoke, so I'm probably going to be drinking a little more water than usual, so I decided to completely rip off this gimmick that my friend Lily did in her talk on Wednesday, and, basically, any time that I take a sip of water, I would like for you to just, like, cheer and go crazy. So I'm going to start off, we're going to practice this.
00:45
You know what you're doing. Okay, cool. So, we're going to start off, we're going to get a little interactive, which is going to be funny because I'm not going to be able to see your hands, so here's the scenario. The phone, it rings, it's the middle of the night, the site is down. Every single person on your engineering team is just out of cell range, or they're
01:04
at RailsConf, and it's just you, so raise your hand if you feel like you know exactly what to do, or where to start. Okay. Alright. A couple of veterans in this audience. Alright, so close your eyes, please. I can't prove that you're doing this, but, so nobody else can judge you.
01:22
Try again, raise your hand if you feel like you know what to do, where to start. Okay, everybody's very honest in this audience. My hope is that by the end of this talk, the people who raise their hands will get some ideas for how to share with their team, and people who might not be as comfortable, ways that they can understand site availability, and for the rest of you that you'll find
01:44
ways to get comfortable yourselves. So one of the big challenges with site availability is that often it just catches us off guard. So every day we might practice refactoring, testing, a site outage can happen for a lot of really random reasons. And so this randomness is like a little scary, and I'm gonna start off by telling
02:04
my scary story. So it's July 2015, I'm working as a Ruby developer at a company called Indiegogo, it's a crowdfunding website, and it's a site where basically people come to fund things that matter to them. And we'd had a lot of successful campaigns, we had an Australian beekeeping campaign
02:20
that raised like $12 million, we had a campaign to fund the movie Super Troopers 2, and in July of 2015, the news breaks that Greece is the first developed country in the entire world to fail to make an IMF loan repayment. And through a strange course of events, this action manages to take down our entire website.
02:43
And it's in the middle of the night in California, in Europe, they're waking up, they hear this news, and they also hear this incredible news story about this really great British guy who just has this wild scheme where he's gonna end the Greek financial crisis. And he decides that he wants to help the Greek people out, so he launches a 1.6 billion
03:02
euro campaign to bail out the country of Greece. And his rationale is like, everybody in Europe throws in three euro, they're fine, they meet their goal, they bail out Greece. And so traffic just starts building, and people are contributing like really small amounts of money at super high rates. And eventually the Indiegogo website goes completely down, and it doesn't fully
03:21
recover until we put a static page up front to handle the load. And for me, this is just so unlike my day-to-day coding, deploying, even like triaging and investigating 500s, and honestly, I was pretty unprepared, and I was pretty afraid. And I wondered afterwards like, how could I have been more prepared for this?
03:43
Load testing is a way that you can programmatically simulate many users making simultaneous requests to your website. It acts as sort of a low-stress simulator into like really high-stress situations. You can play around, build your confidence, and you can kind of come up with your own site availability playbook before your disasters occur.
04:03
And as an added benefit, you can also identify some bottlenecks in your application that could be dangerous in the future. And you can also measure performance benefits of changes that you make along the way, which is really important. The downside of load testing is that when I started, because I don't come from a DevOps background, I'm just a regular Rubyist, I found a lot of like high-level
04:23
instruction that gave commands to just sort of kick off the load tests. And then there was a lot of like really technical instruction about site performance. But there wasn't a lot of instruction on how to bridge those two things. So it's a lot of like trial and error and frustrated Googling. So I'd like to share a couple things with you in this talk.
04:42
I want to talk about how to get started with load testing, how you can turn up the volume on your load testing to really add some load to your site. And then to use just a couple of tools to explore the results that you get.
05:01
A plus. So how do we get started? So we're gonna kick things off. We're gonna start just by preparing our load testing tool. We're gonna talk about the tool Apache Bench, because it's pre-installed on many Linux boxes. And it's just a really simple tool to get started with. So this is the command that starts with AB for Apache Bench.
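(Sketched out concretely, with a hypothetical staging URL filled in, that basic command might look like this; it's built up as a string here so you can eyeball it before running it.)

```shell
# A first Apache Bench invocation (URL is a hypothetical staging host).
# -n: total requests over the test; -c: how many run concurrently.
cmd="ab -n 1000 -c 1 http://staging.example.com/"
echo "$cmd"   # eyeball it, then run it for real
```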
05:22
And it's really like all you need to kick off your first load test. So we'll break it down just a little bit further. So you wanna choose an endpoint to send the simulated traffic to. And to start, a good idea is actually a simple static page that doesn't make any database calls, it's just a way to get a baseline.
05:40
And once you're confident that your load testing tool is actually configured correctly, you just wanna start choosing pages that will bear the brunt of traffic. So for example, Indiegogo, it's our actual campaign pages. Our homepage is not where the traffic is gonna go. But for your site, it could be the homepage, or it could be something else. And you can start by testing local hosts, if you're just kinda playing around.
06:04
But the load test itself uses resources, and they're gonna be consumed on your computer. And because it's using computer resources, it's gonna take away available resources from your web server. And it's gonna really impact your results, especially as the load starts to increase. So on the flip side, running a load test against a production website
06:23
can impact your user experience or even bring down your website. So it's best to point to a staging server or a production server that doesn't host any external traffic, unless you're specifically looking to do stress testing on your production system. But if you're just trying to investigate, don't point it to your website.
06:42
And because at least one person, Lily, was thinking it, technically you're only, this is fine, you can read this. So in that same Apache Bench command that we saw earlier,
07:01
you wanna configure the traffic that you're gonna use to simulate your tests. To finish up this basic command, you need to provide two things. One is the number of requests that you want to execute concurrently, which is the -c flag, and the other is the total number of requests over the life of the load test, which is the -n flag. So for here, we're starting with a concurrency of one.
07:21
And enough requests that the system will get time to warm up, cuz that's important. So this basically means you'll execute one concurrent request 1,000 total times. And you just wanna make sure that you're able to run the load test for a few minutes. And to define our terms a little bit, when I talk about requests, like what is a single request in this scenario?
07:42
It actually doesn't mean a single visitor to your web page, typically. Depending on the number of assets that your page is loading, or asynchronous client requests that your front-end application is going to ask from your server, your single unique visitor could actually make a lot of requests in one visit. And then on the other side, browser caching of assets actually means that
08:03
a return visitor might make even fewer requests than a new visitor. And there's another thing to keep in mind, which is that Apache Bench and server-side benchmarking tools won't actually render HTML or execute your JavaScript. So the latency times that you're seeing here are just gonna be a part of your user experience.
08:20
It's gonna be like the very baseline. So there's gonna be more delay for your users on top of this. So let's look at just an example of an Apache Bench output. Here's a snapshot of the full results, and we'll zoom in just a little bit. And zooming in, we can see that Apache Bench will show us the percentage of requests served within a certain amount of time.
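(That section of the output looks something like this. The 99% figure matches the example discussed in the talk; the other numbers are illustrative.)

```
Percentage of the requests served within a certain time (ms)
  50%    287
  66%    312
  75%    345
  80%    360
  90%    410
  95%    500
  98%    610
  99%    693
 100%    742 (longest request)
```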
08:44
So when you analyze the initial results, you wanna validate that this latency that you're seeing just matches the latency you would expect from a request in real life. So load testing is kind of a black box. If you just start plugging in random numbers and you just don't really understand your system, you can get really amazing results.
09:02
And you're just like, yeah, my site is amazing, and it's actually not real. So you wanna make sure you have a hypothesis for how you would expect the system to perform. So if you have any numbers around how your production server performs, it can just give you a ballpark for your expected request time. So for example, if you look at the line that says 99th percentile latency.
09:24
This is saying that 99% of the 1,000 requests that we made were served in less than 693 milliseconds. And if you have a graph of your production response times, and you're able to see that 99th percentile, and it's showing like 650, you're probably on track.
09:41
But if you're seeing like 100 milliseconds, you should be investigating an issue with your load testing setup. And a really common issue that would cause you to see like really good results in load testing and shit results in production, is that you're testing an error page. Especially if you're using a staging server for your load testing.
10:01
For example, with basic auth, you're gonna need to actually add credentials to your Apache Bench command with the -A flag. Because otherwise you're just testing how well your server returns a 401. Another common issue is hitting a 500 page or a redirect. Apache Bench won't actually follow redirects, so it'll just log each one as a non-200 request.
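(For instance, adding basic auth credentials might look like this — the user, password, and URL here are all hypothetical.)

```shell
# -A user:password supplies basic auth, so ab isn't just measuring 401s.
cmd="ab -n 1000 -c 1 -A admin:secret http://staging.example.com/"
echo "$cmd"
```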
10:24
And the easiest way to tell that you're load testing error pages is to look at your Apache Bench output. It's gonna show you non-200 responses. And if that number is anything but zero, even without significant load on your site, you're probably running into one of these issues. And so if you tail your server logs while you're actually running the load test,
10:43
you should see the server logging the issue. I love the enthusiasm. And it's also, there's kind of a weird thing where you need to differentiate between non 200 requests, or responses, excuse me, from failed requests.
11:02
Apache Bench will remember the content length of your very first request. And if it changes in subsequent requests, so if you have like a dynamic session value, or any other reason that your content length might change dynamically, it's gonna register these in a mostly non-helpful failed requests section. And you'll see it in your output.
11:21
Just make sure that your logs are showing that you're rendering the correct pages, and just ignore it. You can also add a -l flag in later versions of Apache Bench, and it'll accept variable document lengths. So, we feel pretty good. This low-key, low-concurrency load test isn't running into any errors,
11:41
so we'll just start, and we'll start turning up the volume. And as we start to turn up the volume on our load tests, we can see how our app starts to be impacted by load. Let's talk first about how queuing might affect the user experience. So as we increase load, we'll start to see the average response time
12:02
of the whole site, of this page increase as well. So there's something called Little's Law, which is a fundamental law of queuing theory, and it tells us L = λW: the average number of customers in a stable system is equal to the average effective arrival rate times the amount of time the customer spends in the store.
12:21
Which sounds kind of like ridiculous, whatever. But if you think about it in terms of a real world example, it's actually super intuitive. So let's think about a store where there's one cashier checking people out. The total number of customers who are waiting in that line over a specific period of time is gonna be the rate that they come in times the total time that they spend in line.
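(With toy numbers, the arithmetic is easy to sketch — every value here is made up.)

```shell
# Little's Law: L = lambda * W
arrival_rate=20        # lambda: requests arriving per second
time_in_system_ms=500  # W: average time each request spends in the system
# L: average number of requests in the system at any moment
echo $(( arrival_rate * time_in_system_ms / 1000 ))   # 10
```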
12:41
So you get a new cashier, they come on duty, they're new to the grocery store game, and they're twice as slow at checking people out as your prior cashier. And if people just keep getting in line at the same rate, the law is basically just gonna say that your line is gonna get longer, and it's gonna take longer for people to get through the line. So it's kind of intuitive. And then you can adjust that equation
13:01
to help you understand why you would see increasing response time as you add additional load. Because that mean response time, total response time, is gonna be the mean number of requests in the system divided by the throughput. So if your server is taking 500 milliseconds to process a request, and that's staying steady, which actually might not even happen under load,
13:21
it might be going up, the total response time is gonna increase if you add more requests into your system. So a simple web application acts like a giant queue that requests are processed in. So in this example, we'll talk about a web stack that consists of a few things. So we've got a proxy server, application server, database.
13:44
So the proxy server sits behind your firewall, and it communicates back and forth with your client, AKA your users, and your web server. So a common example is gonna be HAProxy or Nginx. And then next, there's the application server. And this is gonna deal with the request
14:01
that needs processing. It's gonna make calls to your database. And in our scenario, we're gonna talk about a single-threaded server like Unicorn. And Unicorn has a master process. This is actually something that Aaron brought up in his keynote. It has a master process that has a configurable number of child processes or workers that do all of your work. So even though you have a single server
14:20
on a single machine, it handles multiple requests at once. And so it's like having multiple cashiers at your grocery store. There's other web servers like Puma, and those are gonna use multiple threads instead of multiple processes, but it's kind of a similar idea. And then in this simple stack, we only have one database. So all the cashiers at your grocery store are all making requests to the same repository.
14:43
And they can all live on the same machine, like the same physical machine, and share the same resources, or they can live on different machines. Just an example, and your website stack will probably look quite different. And so as we add more and more requests to this system, Little's Law is gonna show that this average response time is gonna increase.
15:02
And eventually, if you just add more and more requests, but you don't add any more cashiers to process the requests that are coming in, your wait is gonna grow too long, and your users are gonna see a timeout. And your proxy server will allow a client request to wait a preconfigured length of time, and it's eventually just gonna say, I'm sorry, I can't help you.
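(In Nginx, for example, that preconfigured wait is governed by timeout directives; a sketch, with an assumed upstream name and illustrative values:)

```nginx
# How long Nginx waits on the upstream app server before giving up
# and returning a 504 to the client.
location / {
    proxy_pass http://app_upstream;   # hypothetical upstream name
    proxy_connect_timeout 5s;
    proxy_read_timeout 60s;           # 60s is the default
}
```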
15:24
As you increase the load on your system, there might be a moment where you're like, oh, maybe I should just increase my application server queue, so it accepts more requests. But the danger here is that under extreme load, your requests can remain queued up at your application server level,
15:41
even though your proxy server has long since returned that timeout. And so it means your application server is actually gonna keep churning on those requests, but no one will actually be around to see the page rendered. And it also goes against this sort of recommendation from queuing theory, which is that a single queue for multiple workers is more efficient
16:00
when job times are inconsistent. And if you think about this in practice, say you have two available web workers that can execute jobs, and one is processing this huge file, and one is processing a little tiny file. If another request comes along and you just arbitrarily start queuing up this short request behind the worker that's downloading a huge file, your short request is gonna be
16:20
unnecessarily blocked by the long request, when another worker could have executed the request in a faster period of time. There's also another maybe strange instinct that you might feel, where you increase the timeout threshold on your proxy server, and you just do it higher and higher, so you'll decrease the error rate. But a user that doesn't see a web page load after a minute or two is gonna have
16:42
an equal or probably worse reaction than just seeing the timeout from your proxy server. It's also not just your application that gets affected by load, you'll start to see effects on your operating system, like on the operating system level
17:01
that's actually hosting your application. And so some of these configurations might be in place already on your production machines, but especially when you're bringing on a new staging server or a local machine, you'll probably find that you need to tweak them when you get started off with your load testing. A proxy server has to keep track of each incoming client request.
17:22
And it basically does this by keeping track of the IP address and the port numbers associated with each request. And because each of these requests takes up a Linux file handle, you're gonna start seeing errors potentially like too many open files. So just be sure that your operating system isn't arbitrarily capping the number of file handles
17:42
or file descriptors that your proxy server can access. Like 1024 is a common default. And since your proxy server will use one handle for an incoming request and one for outgoing connection, you actually can run up against the limit pretty quickly. You can actually see on your machine what these are for your user
18:01
that hosts your proxy server by using the ulimit command, if you're using Linux. And there's a rule of thumb for calculating this number that's given by the 2.6 kernel documentation. And it basically says that each file handle is basically 1K of memory. Don't allocate more than 10% of your available memory
18:21
to files by default. So you get about 100 file descriptors per megabyte of RAM, and you can start there and see if your issues go away. And then you wanna actually check two levels. One is the system level, and you can edit that by editing the sysctl configuration.
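(Concretely, the knobs being described might look like this — all values are illustrative, and file paths vary by distro.)

```shell
# System-wide ceiling, e.g. in /etc/sysctl.conf:
#   fs.file-max = 100000
# Per-user soft and hard limits, e.g. in /etc/security/limits.conf
# for a hypothetical "deploy" user:
#   deploy  soft  nofile  20000
#   deploy  hard  nofile  40000
# Check what the current user is actually allowed:
ulimit -n
```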
18:40
And that's gonna be just like the global default for the whole operating system at the higher level. But you'll also wanna make sure you adjust the user limit since that's what your proxy server is actually gonna come up against. So you'll wanna set the soft limit and the hard limit. And those need to be less than that max limit that you set for the whole system. And you save and you close your file
19:02
and you reload your changes with the sysctl command. And then finally, if your proxy server has a file limit configuration, you'll just wanna make sure that you adjust that as well. For Nginx that's worker_rlimit_nofile, but it'll be specific to your proxy server. Another issue that you might run into is TCP/IP port exhaustion.
19:22
So there's a finite number of ports that are available on your machine. And only a subset of those ports are actually available to your application to use. And these are called ephemeral ports. They're used to handle web requests, and once the process is complete, they're gonna be released back into the system so that they can be used on your next available request. And you can tweak two settings
19:41
to increase the number of ports that are available. One, you can decrease the TIME_WAIT period so the port is recycled back into the system more quickly. Even after a port is no longer in use, the system holds it back for a period of time, and this prevents stray packets from leaking across requests.
20:02
And you can also configure your operating system to just increase the available port range altogether. And that's gonna be different on each operating system. And another thing is when you're running your load tests, you just wanna make sure that there's a few minutes between each test, because these ports will still be in use, and you'll wanna let them recycle back.
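(On Linux, for example, both knobs live in sysctl; the settings shown in the comments are illustrative.)

```shell
# Current ephemeral port range on Linux (prints "low high"):
cat /proc/sys/net/ipv4/ip_local_port_range
# To widen the range and recycle TIME_WAIT ports sooner, something
# like this in /etc/sysctl.conf (illustrative values):
#   net.ipv4.ip_local_port_range = 15000 65000
#   net.ipv4.tcp_tw_reuse = 1
```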
20:22
Bless you. The Unicorn documentation has some really good suggestions for operating system tuning for these settings. But at the end of the day, your application is a very special snowflake. You have to think about
20:41
how your application behaves in the wild and how that's gonna affect performance in a way that isn't really being accounted for in your sterile testing environment. So one thing is the relationship between user actions, cache busting, and database queries, for example. So if you're testing an endpoint, your URL, and it returns user comments from the database.
21:02
So an important consideration is how many rows, and what's the complexity of query that you're making against that database. And if there are no comments that are seeded on your test machine, or very few comments seeded, but an expected user behavior under high load is that people are just commenting left and right. It's like the Justin Bieber website, and there's comments coming in constantly.
21:22
Like that's a different real world feeling than in your load testing environment. So even if, also if you're returning, you decided to seed a million comments, and that's on your load testing environment, you're still gonna see that query get cached after your first request.
21:41
So, oh fuck, I forgot something. So if the real-life scenario is that your users are posting like a thousand comments a minute and continuously busting the cache, you can actually simultaneously run scripts alongside your load tests, which will sort of simulate this true effect of comment creation. Another scenario to consider
22:01
is actually blocking external requests. If you're experiencing heavy load and all of your workers are overwhelmed, any worker that's making a slow blocking HTTP request to like, example, a payment processor, it's gonna add to the overall latency experience for everyone waiting behind that request.
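(Going back to the cache-busting idea for a second: the "script alongside your load test" could be as simple as this sketch — the endpoint and params are hypothetical, and in real use you'd loop forever rather than five times.)

```shell
# Continuously create comments while the load test runs,
# so the comments query can't just sit in the query cache.
for i in $(seq 1 5); do                      # in real use: while true
  curl -s --max-time 2 -X POST \
       -d "comment[body]=load test $i" \
       http://staging.example.com/comments > /dev/null || true
  echo "posted comment $i"
  sleep 0.2
done
```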
22:20
And so at this point, you should be really comfortable with sort of the life cycle of the web requests in your stack, what logs to look at, like where to keep an eye out for errors, and you should be confident that when you're running a load test, you're actually load testing your infrastructure and not just like coming up against the limits of your testing framework. And once you're there, you can use additional tools
22:41
to sort of understand like what are the actual bottlenecks that you're coming up against. And one place to start is to investigate the limits of the actual machines hosting your web server. So as you increase load, you can use top, really common tool when you're running your load test to view what percentage CPU and memory are being consumed overall
23:00
and who the specific culprit might be. And one thing to keep in mind is that the percentage displayed is like a percentage of a single CPU. So in multi-core systems, your percentage can be greater than 100%. There's also something really nice called htop, which I really like. It's probably not pre-installed on your system, but it has just like a really great
23:21
visual representation of CPU consumption across cores and it just makes you look like a total badass comparatively. And like when you think about these resources, they're hosting your proxy server, your web server, maybe your database. It's all a zero-sum game. So if you recall from earlier, there's like a server like Unicorn.
23:40
There's a single master process that runs, and you can configure sort of an arbitrary number of sub-processes to handle your web requests. And this is awesome because more workers mean that you can process more requests simultaneously, but they're also consuming the resources on your physical host machine. And so you wanna make sure that you don't have so many web servers configured,
24:00
or web workers configured, that your system is just running out of physical memory and hitting swap, which is located on the hard drive and is a lot slower to access than physical memory, and you'll see that slowdown. So if you look at the average memory consumption for each of those workers on your machine, if you don't have a memory leak, you should see pretty consistent memory behavior,
24:20
and you can calculate how many workers that you can reasonably run on your box. And if you're running other applications on the same box, those are also gonna constrain the resources that your Ruby application can consume. So for example, if your database is on the same machine, you actually might run out of CPU resources as you increase your web workers,
24:42
long before you'll actually start to be able to test pressure on your database itself. And in real life, when site availability is compromised, you might find it's actually pretty easy to spin up more workers to handle traffic, but if they're all trying to access the same database and waiting on it, it can cause huge issues.
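As a rough sketch of that worker-count arithmetic: all of the numbers below are hypothetical, and you'd substitute the per-worker memory you actually observe on your own box.

```shell
# Rough estimate of how many workers fit in RAM. Every number here is a
# made-up example, not a recommendation; measure real per-worker resident
# memory with something like: ps -o rss= -p <worker_pid>
total_mb=4096        # physical RAM on the box
reserved_mb=1024     # headroom for the OS, proxy, database, etc.
per_worker_mb=300    # average resident memory per worker, measured under load
workers=$(( (total_mb - reserved_mb) / per_worker_mb ))
echo "workers: $workers"   # → workers: 10
```

The point of the headroom term is exactly the swap problem above: if the workers alone consume all physical memory, everything else on the host gets pushed to swap.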
25:02
And so if you want to investigate this scenario and see how your database behaves under pressure, you might wanna actually point your app at an external database, which is pretty easy in Rails if you have a database server handy. You just configure your config/database.yml to point to an external address rather than localhost. And you'll also need to configure your firewall
25:22
to accept external connections, but that'll be pretty specific to your setup. As a PSA, though: please don't use your production database in this scenario, because you can bring down your website. Please don't do it. I'm not responsible.
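For example, pointing a non-production environment at an external database host might look something like this in config/database.yml. The hostname, database name, and credentials here are all placeholders, not real values:

```yaml
# config/database.yml — point staging at an external database host
# (hostname and credentials below are hypothetical placeholders)
staging:
  adapter: mysql2
  database: myapp_staging
  host: db.internal.example.com   # instead of localhost
  username: myapp
  password: <%= ENV["STAGING_DB_PASSWORD"] %>
```

Rails evaluates database.yml through ERB, which is why the password can come from an environment variable instead of being checked in.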
25:42
So another thing that a lot of people are probably already familiar with is using application performance monitoring tools to investigate performance issues. One of the most useful tools is just being able to trace your transactions, which is just gonna collect data on your slowest HTTP requests. New Relic and Skylight are third-party tools that typically come to mind.
26:02
But ideally, you should try to set up the ones that you use in production on your load testing server, so that you can see what issues would actually show up in real life. You can also use the gem rack-mini-profiler, in production or in your load tests. Just remember, if you're running it in staging, to switch the environment so the application runs in production mode, so that it'll
26:20
actually profile the requests. And there's a couple of tools you can use to see how load is impacting your database. It's disabled by default, but you can turn on the slow query log in MySQL to see a log of SQL statements that take more than a certain time to execute. And you can configure that threshold, long_query_time, which defaults to 10 seconds and can be set as low as zero.
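Enabling the slow query log might look like this in the MySQL server config; the log file path is a hypothetical example, and you can also flip these settings at runtime with SET GLOBAL instead of editing the config file:

```ini
# my.cnf — enable the MySQL slow query log
# (file path below is a hypothetical example)
[mysqld]
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time     = 1   # seconds; defaults to 10, can be set as low as 0
```

A lower long_query_time catches more queries, so during a load test it can be useful to set it well below the default.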
26:43
And there's also the situation where you could be calling a query way too often. It may not be expensive enough to show up in the slow query log, but if you look at SHOW PROCESSLIST, you'll be able to see all the queries that are currently running. And if one query is running suspiciously frequently, it could be a bottleneck or a performance regression
27:01
that you didn't realize you had. Especially if it's one of those special snowflake queries that we talked about earlier, where the cache is frequently busted under load. And Apache Bench is not always gonna be the best tool on the market. It's very simple, and it's available to you probably right now. But there are other tools.
27:21
Siege is another tool that allows you to use a configuration file to hit multiple endpoints at once, which is really convenient. And the wrapper Bombard allows you to programmatically ramp up the load on your application, which is really, really nice. There's also Bees with Machine Guns, which is both awesome and has the most awesome name.
27:41
It's an open source tool brought to life by the news applications team at the Chicago Tribune, actually. It easily allows you to spin up a lot of micro EC2 instances to load test your web application. If you find that just running from one box is not enough load to really make an impact on your site, it's probably because your site is faster than what a single machine can generate.
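Siege's configuration-file approach mentioned a moment ago might be sketched like this; the hostname and paths are made up for illustration:

```text
# urls.txt — one endpoint per line for Siege
# (staging hostname and paths below are hypothetical)
http://staging.example.com/
http://staging.example.com/products
http://staging.example.com/search?q=load+test

# then run, for example: siege -f urls.txt -c 25 -t 1M
# (25 concurrent simulated users for one minute)
```

Hitting a mix of endpoints like this gets you closer to real traffic than hammering a single URL.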
28:02
And then there's also Flood IO, which is a paid service, but they maintain ruby-jmeter, a Ruby DSL for writing test plans for JMeter, which is a more heavy-duty load testing tool. And so your app might look pretty different from the app I've talked about today. It might be hosted on Heroku.
28:20
You might be using Puma instead of Unicorn. You might be using some kind of complex, fault-tolerant distributed system, in which case you have a totally different set of problems, and I'm sorry you came to this talk. But the great thing about load testing is that it's a framework for curiosity, and it gives you some tools to shine light into the dark and scary places in your app,
28:41
and it can be a lot of fun. So thanks, everybody. I'll tweet out a link to my slides right after this if you're interested in taking a look. You can find me on Twitter at Practice Cactus, and come up afterwards if you'd like to ask me any questions. Thanks.