
PostgreSQL on Amazon


Formal Metadata

Title
PostgreSQL on Amazon
Alternative Title
PostgreSQL on AWS
Number of Parts
20
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Abstract
EC2 with somewhat reduced tears. Amazon Web Services (AWS) has become a very popular platform for deploying PostgreSQL-backed applications. But it's not a standard hosting platform. We'll talk about how to get PostgreSQL to run efficiently and safely on AWS. Among the topics covered will be:
-- Selecting an EC2 instance size, and configuring it for PostgreSQL.
-- Dealing with ephemeral instance storage: What is it good for? How much do you need?
-- Elastic Block Store: How much do you need? How do you configure it for best performance?
-- AWS characteristics and quirks.
-- Why replication is not optional on AWS.
-- Backups and disaster recovery.
Transcript: English (auto-generated)
All right. I'm Christophe Pettus. I'm a Postgres consultant with PostgreSQL Experts, and I wanted to talk a little bit about running PostgreSQL on Amazon's virtual hosting services.
So first, I want to admit to a little bit of selection bias in the information you're about to receive: cops generally meet criminals, doctors usually meet sick people, and database consultants are rarely called in by the people who are saying, my life is perfect, I cannot imagine a better world of Postgres on Amazon.
So generally, our contact with people running Postgres on AWS is with people who are having meltdowns. I'm sure that for every one I'm going to describe, there are 1,000 people who are saying, you know what? My life is a dream. This will come up three or four times in this talk.
So as of about 2010, having dealt with a few of these customers, my opinion of Postgres on Amazon was something like this: don't do that, you'll kill yourself. The first thing we would ask each customer was, well, when are you moving off? The problem is that that doesn't scale very well. Something like 65% of our new clients
are running on Amazon, and this is not the way you want to introduce yourself as a consultant: that's great, now let's replace everything. That's just what they want to hear. And in fact, a lot of them were very good matches for AWS. So we required a more nuanced view of AWS
as the avalanche of AWS customers was pouring over us. So that being said, I want to give just a quick level set. I'm sure everybody is totally familiar with cloud computing, but I want to be clear about what I mean when I use the term. So what is it? There are just too many definitions. Do we mean computing as a service?
Do we mean decentralized storage? I mean, there's iCloud, there's Dropbox, and then there's AWS. These are all cloud services. What the heck? So let's just talk first about cloud hosting. And the one thing that you will get talking to anybody is, of course, that this is a total revolution in computing that has no precedent. And so you get quotes like this:
"the underlying OS allows the operator to divide the computer up into a series of partitions, each one a fixed memory size, isolated from the others." This is, of course, straight from VMware's literature. Well, no, actually, it's from OS/360 MFT, circa 1966. So OK, let's talk a little more about cloud hosting. It's about dividing physical machines up
into multiple virtual machines using a hypervisor kernel. The term hypervisor, by the way, dates from 1965. OK, I'll stop with the old-guy stuff now. And it's about providing these virtual machines as computing resources. The hosting provider manages the mapping of the virtual machines to the physical machines,
feeds and waters the physical machines, keeps them up and running, and provides the APIs to get at them so that you can use them as if they were real individual machines. AWS, Amazon's offering in this regard, has a ton of services. Every time I log into Amazon, there's another tab up there, usually with one or two letters
and a digit for some new service. But we're going to focus on really just two. EC2, which was the original offering, which is basically the compute service, just a lot like Linode or any of these guys, and their storage area network, which is EBS.
So, the Elastic Compute Cloud, which is EC2. There's a huge number of commodity servers spread across data centers worldwide, and they divide them into instances. One of the things about Amazon is they don't call anything by its proper name: you don't get a virtual host, you get an instance, which makes it seem like something freaky and new and different. It's a virtual host; nothing special.
There's no magic here. It's just a machine running your code. Amazon goes a long way to make it seem like something weird and special; it's just a virtual machine. You get a lot of different instance types. They vary from micro on up; Amazon names their instances like t-shirt sizes.
I think QuadXL is the largest, which is too large even for me. As you go up, you get more CPU. Basically, you get a bigger hunk of the CPU that you're running on, more memory, and more instance storage, a hunk of the disk that's physically in the box you're running on.
As far as I know, each of these machines isn't even RAIDed; there's like one disk sitting in there, which will be relevant later. And it's how much of that machine you get, and there's a huge cost range. Amazon charges by the hour: from the moment you turn on a virtual machine and it's running to the moment you turn it off,
that's what your bill will be. Whether you're doing anything or not. So just to remind everyone of this, that you're sharing this instance with other customers. It is nearly impossible, unless you pay extra for it, to get a machine that's dedicated just to you. You will be sharing it with other customers.
They will give you what they promise; they don't lie. If they say you'll get this much CPU and this much instance storage, you'll get it. But the IO channel to the local disk and the network are shared with everybody else on that machine. And the only guarantee they'll give you is that if you pay them more money, they'll put your traffic
a little bit ahead of everybody else's. That's the only guarantee on those particular resources. This is very important when running a database. There is an exception to this, which is dedicated instances. You can buy a dedicated instance, and that dedicates it to a particular customer. Once you provision one of your virtual machines on a particular physical box,
nobody with a different Amazon account can go on that machine. It's still virtualized; it's exactly like every other instance, you're just not sharing it. $7,300 per month per region for the first one. The good news is that it's a flat per-region fee: if you get more of these things,
you can amortize it. So if you're planning to run 1,000 of these, it's seven bucks a month per server. And the instances cost a little bit more. Actually, kind of a lot more. There's also this thing called a reserved instance. And people get these and then wonder why they're being beaten up by all these other customers
because they didn't actually read what reserved means. It's a pricing program. It's a volume purchase program. It's their frequent flyer program. It's not a technical program. It reduces the cost and guarantees you that if you commit to a particular usage pattern, your costs are lower. It doesn't change the tenancy of the servers at all. Zero. So don't go buy one of those
thinking that you're getting it dedicated to yourself, because you're not. That would be way too good a deal. Again, instances are just computers. You pick your own OS from an enormous variety of different images. Probably between Amazon's and the community's,
there's like 50 different Ubuntu images you can boot. You also debug your own kernel bugs. Amazon's providing you a bare machine. They don't wanna hear about your darn kernel bugs. You set up your own infrastructure. Now, Amazon has lots of cool tools to help you do this, but it's up to you to decide this machine's the server, this machine's the web server,
or these machines are the web server, this is my app server, this is my database server. I want everything wired together this way. Amazon provides you the tools, but they don't do this for you unless you pay them. And of course, you install and operate your own user-level software like, say, Postgres. And Amazon keeps the lights on.
So that's all about computing. Now let's talk about storage: where you put your stuff. Every instance comes with a hunk of instance storage. It varies from a very teeny little amount all the way up to, I think, 3.4 terabytes on the biggest compute instances.
They used to call this ephemeral storage, which is probably a better name for it. When Amazon calls it ephemeral storage, believe that it is ephemeral. It survives reboots on the same instance: if you type reboot now, when it comes back up you should have exactly the same instance storage,
and you usually do, most of the time. But there are a huge number of circumstances where it can just vanish. If you ever shut down the instance, it's gone. Actually decommission the instance: gone. If Amazon wakes up one day and decides to migrate you from one machine to another, which they do all the time,
sometimes the instance storage comes with you, sometimes it doesn't. So you really need to treat it like a hard drive that can die at any moment. And the absolute most you can get, on a really expensive instance, is 3.4 terabytes, which is admittedly tons of storage for most applications, but still, it's not infinite.
This is what Amazon wants you to put your database on: EBS, Elastic Block Storage. It's a SAN over ethernet. Again, they make up their own terms for everything. I'm told by people who are better Amazon Kremlinologists than I am
that it's not iSCSI, but it's basically that; there's nothing super fancy about it. You can get individual volumes from a gigabyte to a terabyte, and you can move them from one instance to another. So you can detach a volume from one instance and plug it into another instance, which is cool.
And you can snapshot them to S3, which is their bulk storage service. S3, by the way (this is probably one of the few times I'll mention it), is a service for big old honking blocks of data, but it's not a block storage device itself.
It uses a web API for uploading and downloading stuff, so it's not really suitable for anything like a database that needs block-level access. A little more about EBS: the EBS server provides resilience against hard drive failures. Unlike instance storage, a hard drive dying shouldn't take down your EBS volume. You can mount any number of EBS volumes on a single machine.
I assume there's some kernel limit, but 64 is not too many. And you can create RAIDs using md, the Linux software RAID driver, across multiple EBS volumes, which you may or may not want to do; we'll talk about that. One more thing: EBS runs over the network.
Every instance has a single one gigabit ethernet port in the back. Everything network related goes through that one gigabit ethernet port. Web traffic, database traffic, and the SAN traffic. Everything. So the most you could possibly get on EBS is 125 megabytes a second.
That's the theoretical maximum sustained transfer. You will never get any faster than that number, no matter how many volumes you attach, no matter what complicated RAID structure you set up. That's it; that's the speed of light. And that really does happen: you push it and push it and push it, and that's where it tops out.
Sometimes it bursts up to 130, but there's a lot of caching going on to and fro, so that probably explains that. A little more: it's not cheap. There's a reason Amazon wants you to use it. You pay for the storage itself, and for every IO operation a teeny little meter goes click, click, click. Every time you write that WAL segment, yeah.
It can add up. One of our clients is paying $22,000 a month for one instance. One instance. Almost all of it IO; it's a lot of IO. So yeah, the problem is you're sharing this with lots of other people. You're sharing the instance with other customers,
sharing the network fabric between you and whatever you're talking to with lots of other customers. And these EBS servers are really pounded on. So the result of all this is kind of slow performance. I love this quote: "Performance characteristics of Amazon's Elastic Block Store are moody, technically opaque,
and at times downright confounding." That's from the co-founder of Heroku, and they should know, because they built their whole business on top of Amazon. EBS has good days, though. It has really good days. You're cruising along, 130 megabytes a second, 20 millisecond latency,
not super, but not bad. Really low variability, like it's just ticking along like a clock. And then it has bad days. It's interesting that we are dealing with a technology that has good days and bad days. It's like, remember the days that you had to use the Freon because the component was overheating?
Like two megabytes per second throughput and two second latency. And this is not just one of these spikes. This is like the whole afternoon. It's like this. And it depends on things you cannot control. There is no button go faster to press, and there is even no throw more money into go faster slot.
So having just described all the wonderfulness of EBS, the first thought is, well, why don't we just use instance storage? I mean, there it is. Well: it's not protected against hard drive failure. They call it ephemeral; they mean it. If you totally shut the instance down,
it goes away, guaranteed. And it's not really any faster than EBS. Amazon actually specifically says it's slower, and generally our testing confirms this. Our recommendation is to just use it for the boot volume. As tempting as it is, just sitting there,
it's really not a suitable underlying structure for the database. So why do we care about any of this? Well, databases are all about IO. Especially once the database reaches a certain size, they are largely IO-constrained. This badly limits how fast you can write and, for very large databases, how fast you can read.
So, the unpleasant facts of life, to summarize. There are certain other things about EBS, and about Amazon in general, that you need to know. Instances can just reboot at any time without warning. There it goes. Now, of course, every computer can fail; we all know this. But this happens a lot compared to how often a physical machine just fries.
If the hard drive fails, your instance storage is gone, and of course your machine reboots. We'll talk about EBS volume failure modes later. So just be prepared for this. There's nothing you can do about it; plan for it. Design the resiliency into your infrastructure.
Okay, enough level setting. Let's talk about what good things you can actually do to make Postgres go on Amazon: configuring your instance, configuring EBS, configuring Postgres, and replication. Because if you're running on Amazon, you're doing replication. Everybody just nod and agree with me; it'll be much easier.
So, configuring the instance. The most important thing is how much memory you've got, because really, that's the big control. If you can fit the whole DB in memory, just get a big enough instance to do it. 68 gigabytes is the biggest instance you can get out of Amazon right now, and it's been the biggest instance for quite some time.
EC2 doesn't change that much, actually, interestingly enough. The price keeps going down, but the maxima have stayed roughly the same for years. So if you can fit it all in memory, get enough memory; you'll use it. If you can't, max out the memory based on your budget.
The biggest, highest-memory instance that you can get out of Amazon will cost you about 1,200 bucks a month. So that's the most you'll pay for just the instance, not the IO. And which distribution to use? Use Ubuntu 11.04. See, I just saved you tons of time.
With every other Linux distro, we've run into problems. The Ubuntu 12.04 kernel freezes frequently enough to be a problem. This was confirmed by Instagram, one of our clients, and they run a lot of these servers. So they just say: use 11.04, don't even think about 12.04 at this point.
So that was easy. So CPU. In reality, you almost never run out of CPU on a Postgres server on Amazon. IO is almost always the limiting factor. There are specialized cases where you will run out of CPU, but generally, at the point that you crank the memory high enough,
you've given yourself enough CPU as well. Any number of people can come up with any number of pathological cases where this is not true, or good business models, I guess, which is another word for pathological. But generally, this is true. If it's a budget decision, always go for memory over CPU: skip the extra CPU,
and put the money into memory. Of course, at Amazon you don't have a lot of choices, so they come together. Generally, our experience is that when you exhaust the CPU, it's because other stuff is running on the box besides Postgres. So don't do that. Don't run your JBoss server, don't run your mail server, don't run all that stuff on the box. Just give them their own instances.
That's one of the nice things about Amazon: you don't have to order a new Supermicro box. So, configuring EBS. I'll resist the temptation to circle the floppy disk and put EBS down there. The good thing about EBS is it's really easy to configure. Do you RAID it or don't you?
That's it. That's the only decision you have to make. There is some folk wisdom out there. The problem with Amazon is they're very, very opaque about their technical details, so this whole homeopathy thing has sprung up around Amazon: well, I smeared it with this ointment and it worked better. Like pre-zeroing the EBS volume:
it doesn't work. If you measure it enough times, it averages out. RAID 10 does not help. You don't need the RAID 1 layer, because it's already RAID 1, or probably, the rumor is, RAID 5 or RAID 6, all the way over on the EBS server.
So here's the pro-RAID argument. Almost all the measurements show RAID 0 across multiple EBS stripes outperforming a single volume. Almost all: some customers routinely get terrible performance off of this.
Isn't this a great technology? It's like a full employment program for consultants. The variability is less on writes than on reads: the stripe set and the single volume are closer together on writes than reads, but striping is still better, which makes sense given that a write has to be pushed out to all the stripes.
Generally, eight-stripe RAID 0 seems to be the place where the curve starts flattening. So there's not much point in going beyond eight stripes unless you need more than eight terabytes of EBS space, at which point, throw more stripes on. So, the anti-RAID argument.
One of the nice things about EBS is you can snapshot a volume to S3. Push a button, and it makes an atomic copy. It doesn't handle the Postgres side of it, but the underlying files get copied over. But if you have a stripe set, the snapshots happen one volume at a time, and you've just lost the consistency. So you actually have to quiesce the volume,
which usually means taking Postgres down. So that's kind of annoying. And if you have 64 EBS volumes, remounting them all and rebuilding the RAID on a new instance is kind of a pain in the neck. You definitely want to script that, because you don't want to get it wrong. And the problem is you've just added
all of these variables together and it only performs as well as the slowest member. So you have even more variability in your EBS volume which isn't kind of what anyone wants. And since you now have more volumes but one failure takes down the whole set
because it's RAID 0, you've just increased your chance of losing your data to an EBS failure. It's like, wait, what? EBS volumes can fail. We have one particularly unlucky client; this has happened to them three times this year. Yeah, it happens.
Generally, the way you find out about this is that the instance reboots, tries to mount the EBS volume, and Amazon says, what EBS volume? Oh, we've stopped billing you for that EBS volume that doesn't exist. So we'll give you a refund; there'll be all four dollars of it.
And if one stripe fails, the whole set's useless. Well, that's exciting. So you have to plan for this, just like a failure on a local RAID 0 cluster. So, tips and tricks. Just use XFS.
There's really no reason not to, although as long as you're not using ext3, you're okay; even ext2 is better. Set the readahead to 64K. Set the chunk size on the RAID to 256K. And use the deadline I/O scheduler or noop. (A sketch of all of this is below.)
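As a minimal sketch of those settings, assuming a hypothetical eight-volume stripe set attached as /dev/xvdf through /dev/xvdm (device names and mount point are illustrative, not from the talk):

    # Build the eight-way RAID 0 with a 256K chunk size.
    mdadm --create /dev/md0 --level=0 --chunk=256 --raid-devices=8 /dev/xvd[f-m]

    # XFS on top, mounted without atime updates.
    mkfs.xfs /dev/md0
    mkdir -p /pgdata
    mount -o noatime /dev/md0 /pgdata

    # Readahead: 128 x 512-byte sectors = 64K, per the recommendation above.
    blockdev --setra 128 /dev/md0

    # Deadline scheduler on each underlying device.
    for dev in xvdf xvdg xvdh xvdi xvdj xvdk xvdl xvdm; do
        echo deadline > /sys/block/$dev/queue/scheduler
    done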
The difference between the schedulers is actually pretty small; deadline pulls ahead a tiny bit, but again, there's lots of noise in this data. And that's kind of it; those are all the numbers you need to know. Okay: we have our instance sized for the database, and we have our EBS volumes all mounted and humming along,
and you're watching the costs tick up. So you better get your database up quickly. Again, instances are just virtual computers. Nothing special. Anything you would do to tune Postgres normally, just do it here too. There's nothing super magic about one of these. This obviously can't be a full Postgres tuning talk.
Check out Josh Berkus's Five Steps to Postgres Performance talk. He's a coworker of mine, so of course I'm going to plug his talk, but there are lots of good Postgres tuning resources. So do all that stuff, definitely. But here are the basics specific to EBS and Amazon. Only run Postgres on the instance.
Put all of PGDATA onto the EBS volume, striped or not, depending on your decision in that regard. It's fine to put the operational logs (the text logs, error logs; do those have an official name? I've never really known) on the instance storage. But pg_xlog: put it on the same EBS volume
as the rest of the database. That's our recommendation, and it's exactly contrary to the normal advice you'll get if you're running on your own machine. One of the first things any consultant will tell you to do is move the xlogs to their own set of disks, because the seek characteristics of the main database volume and the xlog volume are completely different:
one is just sequentially writing data, and the other one is moving all around doing indexes and all sorts of crazy stuff. The problem is, you're sharing this server with half the country. You cannot optimize the seeks. You have that tiny little slot that's left over while Reddit is running on the entire rest of the machine.
You're not going to be able to count on a good sequential read off that EBS volume, so don't even try. And anyway, if you lose the EBS volume, the database is toast, so there's really no extra redundancy gained by moving the xlogs someplace else.
Don't put pg_xlog on instance storage. That's a very tempting bad idea, because you've just rendered the database unrecoverable if the instance fails. If the instance fails but your EBS volume is okay, you can reattach it to a new instance, bring it up, Postgres will dutifully enter recovery, and you'll be in good shape.
So now it's time for the hardcore part: what parameters to set in your postgresql.conf. There are two. Set random_page_cost to 1.1. Why 1.1? Beats me; it seems to work out. Actually, there is a little bit of analytical knowledge behind it.
The problem is that, once again, you can't control the seek behavior on EBS. Whether random reads or sequential reads on EBS are faster is lost in the noise. Given that, sequential reads can be a little bit faster
if the EBS server is otherwise unoccupied, for the same reason they are anywhere. So 1.1 seems to be better than 1.0, but I will admit there's a certain amount of voodoo baked into that number. The other parameter is effective_io_concurrency: if you're doing striped RAID, set it to the number of stripes. If you're not, leave it alone.
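In postgresql.conf terms, that's just this (a minimal sketch; the 8 assumes the hypothetical eight-way stripe set from earlier, and the path is illustrative):

    cat >> /pgdata/postgresql.conf <<'EOF'
    random_page_cost = 1.1
    effective_io_concurrency = 8   # number of stripes; omit if not striping
    EOF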
Great: you've just tuned Postgres for Amazon. Pretty cool. Now it's time to set up replication. If you're running Postgres on AWS, you need to be doing streaming replication or traditional WAL shipping. Just do it. (A bare-bones streaming setup is sketched below.)
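For reference, a minimal 9.x-era streaming replication setup looks something like this; the hostnames, user, password, and network range are placeholders, not anything from the talk:

    # On the primary:
    cat >> /pgdata/postgresql.conf <<'EOF'
    wal_level = hot_standby
    max_wal_senders = 3
    wal_keep_segments = 128   # cushion so the standby can fall behind briefly
    EOF
    echo "host replication repuser 10.0.0.0/8 md5" >> /pgdata/pg_hba.conf

    # On the standby, after restoring a base backup
    # (add hot_standby = on to its postgresql.conf if it will serve reads):
    cat > /pgdata/recovery.conf <<'EOF'
    standby_mode = 'on'
    primary_conninfo = 'host=primary.internal user=repuser password=secret'
    EOF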
The problem is, as I have described in tedious detail, lots of things break all the time on Amazon. They're using terms like "ephemeral"; the message they're trying to send is that these things are not super-expensive, hardened computers with redundant everything.
Things just go away all the time. So generally what you want to do is stream replication from one instance to another. The second instance doesn't have to be as capable as the first. Generally it doesn't need as much CPU,
because mainly what it's doing is keeping up with the primary. If you're using it for read-only queries, of course, then you want to size it appropriately for the CPU usage there. Amazon has this thing called availability zones, which are kind of sort of like data centers,
or parts of data centers; it's unclear exactly what they are. Again, they can't give anything its proper name. But the reason these are important is that you have to put the replica in a different availability zone. You have to. Don't argue, just do it. The reason is that Amazon appears to have customer affinity
for machines in the same availability zone. So if you create two instances in the same availability zone (easy for you to say), it will generally provision them on the same physical machine. The problem with putting the replica there should be immediately obvious. Putting it in a different availability zone is the only way of guaranteeing
that it'll be on a different physical machine. By the way, there is another trick: if you're getting pounded on and your instance is performing really terribly, try creating a new instance in a different availability zone, because you may just be in a bad neighborhood. You may be like: oh look, there's a slot available on the Reddit server.
So, EBS snapshotting: one of the coolest things about Amazon. Dealing with SANs and SAN snapshotting is usually kind of a pain in the neck: you're down in the manual trying to figure out which program you have to install and all this stuff. Here, it's: run the command line, please snapshot it, bing, you're done. It's great.
It's specifically great for doing point-in-time backups. Because one of the extremely cool things about Postgres, which I spend time explaining to people over and over again, is that for a point-in-time backup, the underlying file system image does not have to be perfectly consistent. You don't have to tell Postgres to stop writing to the data. (A sketch of a snapshot-based base backup follows.)
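A minimal sketch of that, for a single unstriped data volume; the volume ID is hypothetical, and the unified "aws" CLI shown here postdates the talk (the era's ec2-api-tools spelled it ec2-create-snapshot):

    psql -c "SELECT pg_start_backup('ebs-base-backup');"
    aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
        --description "postgres base backup"
    psql -c "SELECT pg_stop_backup();"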
People just don't believe it, but it's true, and it's so cool. So doing a SAN-style snapshot as the base backup for point-in-time backup is perfectly reasonable. Just make sure you're saving the WAL segments also; another good reason to have them on the same volume. And Heroku has this tool, WAL-E,
a set of Python scripts that does all this for you on Amazon. It's super cool: it saves everything to S3 for you. Very neat. One button, and you've got a point-in-time backup. Move it to a different machine, mount it, re-provision it; it makes it really easy to create streaming replica slaves.
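WAL-E's documented pattern, roughly; the envdir directory holds your AWS credentials and WALE_S3_PREFIX bucket setting, and the paths are illustrative:

    # Ship each WAL segment to S3 as Postgres finishes with it:
    cat >> /pgdata/postgresql.conf <<'EOF'
    archive_mode = on
    archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p'
    EOF

    # Push a base backup of the data directory to S3:
    envdir /etc/wal-e.d/env wal-e backup-push /pgdata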
So, disaster recovery. If the giant planetoid hits Earth, people can still buy tickets off your site. Put a warm standby in a different Amazon region; that way you know for sure it will be in a different data center, with largely different connectivity. Because ultimately Amazon is running on the same public internet we all are, at some level of the periphery.
If that public internet starts having indigestion, or a meteorite destroys Virginia or something, you're in good shape. One of the nice parts about this is that it's a warm standby rather than a streaming replica: you can allow for point-in-time recovery by keeping the WAL segments and multiple base backups. Which is very handy, because
streaming replication is great for a large class of failures, but "oops, I dropped that table" is not in that class: the standby will apply that DROP right off the streaming connection just like anything else, before you have a chance to hit control-C. Point-in-time recovery will save you there (a restore is sketched below). So generally you keep, say, two to four backup snapshots.
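A minimal PITR sketch, assuming the WAL-E archiving shown earlier; the target time is hypothetical, whatever moment preceded the "oops":

    cat > /pgdata/recovery.conf <<'EOF'
    restore_command = 'envdir /etc/wal-e.d/env wal-e wal-fetch "%f" "%p"'
    recovery_target_time = '2012-05-16 13:59:00'
    EOF
    pg_ctl -D /pgdata start   # replays WAL up to the target time, then stops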
If your database is gigantic, these things are going to be big; maybe you do two to four backups a week. Again, you're using the snapshotting; it's really cool. Okay: if you're doing replication, monitor, monitor, monitor. Monitor everything that's going on. Because it's really easy to fill up disks
with replication if something goes sideways, because WAL segments start piling up somewhere. A very common scenario: the secondary goes down, the WAL shipping keeps going, but nothing is cleaning up those WAL segments on the secondary anymore, so eventually that disk fills up. Or the master goes, huh,
I'm not able to successfully push these WAL segments across anymore, so I'd better hold on to them in case the secondary comes back up; and then that disk fills up. And that's a knotty problem, a tedious problem to recover from. So monitor disk usage at the minimum.
If you aren't already familiar with it, check out check_postgres.pl from bucardo.org. It's a Nagios-compatible plugin for doing all this stuff. It can monitor things that you would never in a million years want to monitor, but it can monitor a lot of useful stuff too.
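For example (the thresholds here are hypothetical; wal_files catches WAL segments piling up in pg_xlog):

    check_postgres.pl --action=disk_space --warning='80%' --critical='90%'
    check_postgres.pl --action=wal_files --warning=100 --critical=150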
Okay, quick break: I'm going to take a quick breath and take questions at this point. I'm either completely missing the boat or overwhelming you with data. Well okay, yeah, I love being in this session. But okay, I'll talk louder. So: scaling, like a fish. You've done all this stuff: you've tuned it, you've got
the High-Memory Quadruple Extra Large instance (this is a real instance name, by the way) with its eight-stripe RAID 0 EBS mount, and you've run out of horsepower. Okay, now what? First: most scaling issues are really application issues. You probably haven't actually driven Postgres to the wall.
There are things you can do on your application. Like if there are things in the database that don't need to be there, please take them out, like web sessions. Move as much read traffic as you can onto your streaming replicas. That's what they're there for. They do a great job of that stuff. Memory's cheap, so just throw more memory on the box if you haven't maxed it out already.
You can aim for a shared-nothing application layer, because then you can provision and terminate app instances as you need to without a whole lot of folderol. One of the nice things about Amazon, one of the best things about it, is that it makes it really easy to spin up and spin down these instances.
You create your own AMI, a saved machine image, and when you need a new app server: bonk, bring one up. Companies like RightScale and Heroku, this is basically their business. Heroku does a lot more than this, but this is one of the cool things about Amazon: you can bring up and tear down these instances, so use that. (Spinning one up is a single API call, as in the sketch below.)
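Something like this; the AMI ID and instance type are hypothetical, and the unified "aws" CLI shown here postdates the talk (the equivalent lived in the ec2-api-tools at the time):

    aws ec2 run-instances --image-id ami-0123456789abcdef0 \
        --instance-type m1.large --count 1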
And digest and cache as much as you can: page fragments, result sets from common queries, all that kind of thing. Use Redis for web sessions and the like; it's an in-memory database built to handle exactly that, and it does a great job. And you do all of this stuff
and then you still smash into the wall. So generally where it happens is you run out of write capacity on your main database. It just can't keep up anymore. And because it's the main database, there's no place to go, the secondaries don't help you here. Generally what will happen is, depending on your workload, at a peak moment,
suddenly all the needles go into the red and it's taking longer than you can allow. So unfortunately at this point, if you're staying on Amazon, you need to make some tough decisions. What everyone wants to do is sharding. Everyone loves sharding.
Sharding is where you partition the database across multiple database servers. You isolate what you can, so that each database has a portion of your total data, and then you duplicate the rest of the smaller stuff across all of them for consistency purposes. It's great for a particular class of workload: the kind where the workload is
proportional to a small atom of business, like a user or a like or something like that. Those tend to be the more shardable kinds of applications. Sharding has lots of fun challenges, like keeping IDs unique. There's a really good paper from Instagram about their method for keeping IDs unique in their sharded architecture (a sketch of their scheme follows).
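Adapted from Instagram's published scheme: a bigint built from 41 bits of milliseconds since a custom epoch, 13 bits of shard ID, and 10 bits of a per-shard sequence. The schema name, epoch, and shard number here are illustrative:

    psql <<'SQL'
    CREATE SCHEMA shard_1;
    CREATE SEQUENCE shard_1.table_id_seq;

    CREATE OR REPLACE FUNCTION shard_1.next_id(OUT result bigint) AS $$
    DECLARE
        our_epoch  bigint := 1314220021721;  -- custom epoch, in milliseconds
        seq_id     bigint;
        now_millis bigint;
        shard_id   int := 1;                 -- unique per logical shard
    BEGIN
        SELECT nextval('shard_1.table_id_seq') % 1024 INTO seq_id;
        SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis;
        result := (now_millis - our_epoch) << 23;
        result := result | (shard_id << 10);
        result := result | (seq_id);
    END;
    $$ LANGUAGE plpgsql;
    SQL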
Then there's routing: how do you get work to the right database? Now you have 64 of these databases, and you need to figure out which one to query to do something. And when you have shared data, you need to push it everywhere. Say you're Groupon (this is completely speculative; I have no insight into Groupon's architecture whatsoever): you can imagine the deals are spread out across all the databases and the users are sharded. So you need to distribute the deals across all the databases, and you need an architecture to do that. And when one of these shards fails, what do you do? Also, for reporting, you are almost always
going to have to do consolidated queries across all these databases. And how do you do that without suddenly crushing yourself, which is exactly what you were trying to avoid by sharding in the first place? So let's talk a little more about that. One solution is to export everything to a central data warehouse, because presumably the data warehouse
can take a little longer to run one of these ginormous queries than the machines that are actually handling front-end requests. Or you can distribute the queries: send an individual query out to each shard and pull everything back in. PL/Proxy is really cool for this. If you're not familiar with PL/Proxy and the Skype tools,
it's a mini embedded language for Postgres for doing exactly this kind of distributed querying, and you can build your own aggregation on top of it. It's really neat; check it out. (A taste of it is below.)
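The canonical PL/Proxy shape, assuming the plproxy language is installed and a cluster named 'usercluster' is configured: the function body names a cluster and a shard-routing expression instead of SQL, and calls get forwarded to the matching shard.

    psql <<'SQL'
    CREATE OR REPLACE FUNCTION get_user_email(i_username text)
    RETURNS text AS $$
        CLUSTER 'usercluster';
        RUN ON hashtext(i_username);
    $$ LANGUAGE plproxy;
    SQL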
Sharding is not for everyone, though, and there are at least two categories in my mind that really are not great for it. First, your traditional data warehouse: hard to shard, because the sharding pattern is completely non-obvious. You come up with this great sharding pattern, and then the next query needs a different one. And it's really hard to come up with a totally consistent sharding pattern that will help you.
In some cases you can, but not all of them. Second, applications that do extremely high write volumes, like sensor data, where there's huge inbound traffic and the database has to keep up in real time. Sometimes you can make sharding work there if, again, there's the right distribution pattern.
But these run out of steam on Amazon very, very fast because of the latency and throughput problems of EBS. And again, sharding is very cool right now and it's a really neat architecture. And if you can build your application this way from zero, you will be very happy you did. But if it's not the natural application architecture,
don't deform it just to shard because of Amazon. So if you're building from zero and you really want to, and you think, well, I might throw a switch and a million people will show up, that would be cool. What should I do? Design it for sharding.
Just assume that you're going to have multiple databases and figure out all these problems right from zero. Every instance is disposable. Remember the good old days, when we would buy servers and there'd be a stack of seven and we'd name them after the planets or something like that? And we'd say, this is Saturn and it's my friend and all that.
These are Kleenex: use them, throw them away, get rid of them. Give them numbers, not names; don't think of them the way we did before. You'll be much happier on Amazon if you can think of it as: we need more compute, so turn the compute dial up, and it clicks up in units of an instance.
The thing that is cool about Amazon, and where I think they are still ahead of everyone else in this arena, is that they have really cool APIs. Everything has a cool API. Mechanical Turk has a very cool API, so you can have your computer tell people to do things. It's pretty neat.
I would say a load-testing service using Mechanical Turk would be kind of a neat thing to build: just send out a Mechanical Turk task, everybody go pound on my website now, please. Probably expensive, but it'd be neat. But provisioning a new server, moving EBS volumes around, all that kind of stuff: it all has APIs. In fact, you can do more with the APIs than you can with their console. So use that, and automate everything. (For example, the volume shuffling mentioned earlier is a few CLI calls, as sketched below.)
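All IDs here are hypothetical, and again the unified "aws" CLI postdates the talk:

    # Create a volume, attach it to an instance, detach it for reuse elsewhere.
    aws ec2 create-volume --size 100 --availability-zone us-east-1a
    aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
        --instance-id i-0123456789abcdef0 --device /dev/xvdf
    aws ec2 detach-volume --volume-id vol-0123456789abcdef0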
When you want to create a new app server, that should be one button. I need a new app server please. So at this point the answer is okay, this is a ton of information but what do I do? How do I make this decision? And so I'm gonna try and distill this down. This is the reaction I generally get when I go through this and say,
but should I run on Amazon or not? Okay, here's the summary. Yes or no. Generally, if you have a small database, your application is not particularly write critical. It's not being swamped by writes. There's a lot of locality of reference in the database, even if it's a huge one.
So you're not doing queries across everything. You can partition the tables or do something. So generally you will have a small working set in memory or your application's shardable. That tends to indicate a yes for Amazon. If you have a large database, and by large I mean over a terabyte in this case, it's write critical.
So you have to keep up with a large real-time write volume; you generally don't have locality of reference; when you do data analysis, it's across the whole darn one-terabyte thing, so IO is always going to be extremely important; and your application's really not that easily shardable. That tends to indicate a no for Amazon.
Traditionally, your basic web OLTP application tends to be not a bad fit for Amazon; there's a reason it's called Amazon Web Services. A data warehouse tends to not be as good a fit. But again, this is just two axes out of the 25-odd dimensions of possible database applications.
So you need to judge it for yourself. Or, you know, there's no reason everything has to be on Amazon. You can develop on AWS and deploy on traditional hardware, as long as your deployment and development environments are reasonably close. Amazon's great for this kind of thing, because you can just destroy the instance when you go home at night so you're not paying for it.
Or you can put the web-facing services on Amazon and the back-end data warehouse on traditional hardware. By this I don't mean running the database that the web services are talking to on traditional hardware; the network traffic would have to make this big hop across the public internet to the database,
and that would probably kill you. Again, specialized applications, maybe. But, for example, the web-facing servers could collect all the data and pump it up to the data warehouse regularly. So consider that. There is another option. People like this option. I don't; maybe I'm just too old and in the way, and eventually I'll learn to stop worrying and love the bomb.
You turn off all the Postgres safety features: fsync off, which implies synchronous_commit off. You just don't care; you rely on streaming replication alone to preserve your data. So on both the master and the slave, fsync is off, and you figure: how bad could it be, really?
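For the record, the settings in question look like this; shown strictly for illustration, since the talk's conclusion is don't do it:

    cat >> /pgdata/postgresql.conf <<'EOF'
    fsync = off                # a crash can corrupt the database
    synchronous_commit = off   # implied by fsync = off; recent commits can vanish
    EOF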
You treat everything as disposable. If one goes down, it's like, yeah, well, things happen, you know. And you just hope the numbers work out in your favor. We do not recommend this. And above all, avoid Amazon Stockholm Syndrome.
This is where I've gone into clients and given this whole pitch: you're spending $22,000 a month; you could buy a gigantic server on your credit card, hire someone to run it, and save $10,000 a month. And they say, well, yes, but how do we fix
this problem on Amazon? Okay, I'm paid by the hour. But no one cares. No one comes and says, you know, I'm going to buy this laser pointer from you because you run on Amazon Web Services. No one cares.
If you're not getting what you need, move; it's just a resource. And remember, it is just about cost. The traditional cost model that we all grew up with has a really high buy-in: you write this giant check, and this giant server
with all the spinny disks arrives, and then you have to figure out a place to put it. The cost comes in bumps and jumps. There's a big upfront cost, you have to hire someone, and that burns a lot of money and all that kind of stuff. It's really hard to scale this kind of thing on demand, because, you know: oh, shit, new server. But there are a lot of economies of scale in this model,
because once you've hired your first DBA, you probably don't need a second one for quite some time. The new, hip, with-it Amazon cost model basically starts at zero. You sign up for AWS; that doesn't cost you anything. Pretty much, the more capacity you use, the more you pay.
It's very easy to provision up and down. Say you're selling ski lift tickets: during the winter you turn the dial up, during the summer you turn it down. Very easy. The problem is there are no economies of scale for you. Amazon Web Services has huge economies of scale; those all go to Amazon, not you.
The fact that you can get discounts, that's not economies of scale; that's just a discount. So here's my graph that explains it all, the most oversimplified graph in the history of computing. We have capacity and cost. The traditional model goes up in steps: suddenly, oh shit, new server, bonk. New server, bonk. The Amazon line has a gentler rise, mostly because of bandwidth, but eventually the two will cross. The crossing point may be right here, depending on your application, or it may be way over there in Quebec, but they will cross eventually as capacity goes up, because eventually the economies of scale
will work in your favor, not Amazon's. So remember, when you're costing this: bandwidth is extra on Amazon, as it is everywhere, and IO operations are extra. That's what can really kill you on Amazon, because once you've bought your own server, HP does not come and send you a bill every time you use their RAID card to write to the disks you bought. Amazon does. And these charges can easily swamp the instance cost. You run the calculator and come up with: oh well, our instance is about 300 bucks a month. That's not bad, that's great. Then you start running it, your first bill comes,
and two days later you get one of these emails from Amazon saying: we've noticed your usage is slightly higher; hope the $8,300 bill you're about to get is okay. So be sure to include these in your estimate.
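As a rough worked example, using the old standard-EBS rate of about $0.10 per million IO requests (illustrative numbers only; check current pricing):

    1,000 IOPS sustained  x  ~2.6 million seconds/month  =  ~2.6 billion requests
    2.6 billion requests  /  1 million  x  $0.10         =  ~$260/month for IO alone

And that's a fairly modest database, before bandwidth.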
And a note about staffing, because this is one of the places where everyone says: but we don't want to hire ops staff. Cloud hosting doesn't mean no operations staff. Remember, they're selling you computers. Cloud hosting lets you defer the hiring farther out, but eventually you will need people. Eventually your system will reach a level of complexity where
you will need someone to help manage it for you. At that point, you might pick up the phone and call Heroku and say help, because this is one of the things they do: they push that point way farther out on the curve for you. But eventually you will get there. Take this for what you will, but every one of our large AWS clients has at some point had to hire ops staff. Generally just one, but they've had to hire them, because they just couldn't do it all anymore, and they were getting tired of writing us checks. So, okay, here we are. AWS is a great solution; I really like Amazon for the things it's good at, and it's good for a very large class of applications.
If it's a good fit, use the sucker. Use the APIs. Use its ability to spin instances up and down. Talk to people like RightScale and Scalr and Heroku and really bang on it, because it's got a lot of neat stuff associated with it. But don't deform your whole technical architecture just because you think Amazon's the only way to go, and cost it out very carefully. Do the math. It will not always be cheaper on Amazon. And there we are. Thank you, and there I am: the business website,
that's PGX's website, and my personal website; the slides will be up there. Yes, sir? Yeah, that slide could be a little clearer. Let me explain what I mean. Suppose you sell goodies,
and your order management software runs entirely on Amazon. You keep the orders, you keep everything there, because there's an obvious sharding model for that: by region, by product line, by whatever you want, by user. Users one through a million are on this server, and you keep all the data isolated.
You may have to push your product catalog out to all the servers, but the product catalog doesn't grow nearly as fast as your user base, you hope. At some point, though, you need to do your tax return for the year, and you need all this data. So my suggestion there is: suck all the data down, as it becomes fixed, to a data warehouse that's in-house,
the machine sitting under your desk, probably not literally, but something more suited for doing those kinds of big, IO-bound operations. That slide could use some tweaking, though; you're absolutely right. Sir? The question, roughly: which of these recommendations apply to virtual machines generally, not just on Amazon? Well, let's see. Generally, virtual machines have worse IO than non-virtualized machines,
and the main reason is that now two sets of people have to write good drivers: the VM container writers and the device writers. It's hard enough for the people who built the RAID card to write a good driver; we've discovered it can be really hard for two sets of people to write good drivers. But that's a fact of life, not a tuning issue; you just have to be aware that the IO is going to be bad. Generally, if you are sharing a SAN, you probably want random_page_cost to be lower, because what random_page_cost expresses is how willing Postgres is to do random IO operations
versus sequential operations, like an index scan versus a sequential scan. Generally, you want to set it reasonably high. However, if you're running on SSDs, for example, you probably want to set it lower, because random IO is cheaper.
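In configuration terms, it's a single knob; these values are illustrative starting points, not gospel:

    # postgresql.conf -- illustrative values, not gospel
    random_page_cost = 4.0    # the default, assumes spinning disks
    # random_page_cost = 1.5  # plausible for SSDs or heavily cached shared storage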
So, again, the best tuning you can do on virtual machines: first of all, you generally want to run a hypervisor-style thing rather than a jails-style thing. In my experience, jails-style virtualization is lower performance. That's a really broad, sweeping statement that allows for many exceptions, but there it is.
And always buy more memory, because the memory isn't shared: once you've got the memory, you've got the memory. So give yourself as much memory as you can. There's lots of neat stuff about running virtualized machines: you can move images around, you can do all this stuff.
There are a lot of really good system-management reasons to run virtualization, even if you own the hardware. But you do want to be aware that it will hurt your IO somewhat, and how much depends on the relative quality of the drivers in the stack, sadly. Yes, sir? Generally, the main reason that XFS appears to perform well is that it handles journaling more efficiently than the others, than EXT3 in particular. Again, it's really more of the next generation
of filesystem beyond EXT3. There are basically three of those: JFS, XFS, and EXT4. Our experience is that XFS is a little more mature than the others, and a filesystem is something you really don't want to get wrong, so we're hyper-conservative about that. There's no slander directed at EXT4 particularly; people run with it, and it seems to run great. The thing you don't want to run is EXT3, because its journaling is very slow.
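For what it's worth, putting a Postgres volume on XFS is about this much work; the device name and mount point here are hypothetical:

    # Format a fresh EBS volume (device names vary by instance).
    mkfs.xfs /dev/xvdf
    mkdir -p /srv/pgdata
    # noatime saves a metadata write on every read.
    mount -o noatime /dev/xvdf /srv/pgdata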
Mm-hmm. Is that based more on performance, or on cost? A little of both. It's mostly from a performance point of view, but the less IO you do to EBS, the more money you'll save. That's good advice on systems of any kind, but it's particularly applicable to Amazon, where the IO story is so nuanced, if that's the right word. Yes, Bruce?
So, I guess if I'm looking at a user who's considering Amazon, the big win, in my head, and I'm trying to understand why I'm wrong, is the ability to effectively scale up the instances you're looking at. Correct, yeah. But then you go over a lot of the nitty-gritty details that those of us who deal with big iron are playing with all the time, like shared IO, and a lot of the headaches; it's every problem that ops people have with scale, multiplied by 10,000. You know, if one is of a certain age,
one would say: oh great, we just spent the entire 90s making hardware reliable, and now we're spending the rest of our careers building these unreliable instances on top of all this reliable hardware we put all that work into, right? And one could be forgiven for that point of view. Again, it really depends on your win. I'm going to give you sort of a tale of two clients,
one of whom I can name and one of whom I can't. Instagram is the one I can name. They've done a really good job. When I first walked in, it was one of those skull-and-crossbones things: I said, oh my God, it's another one of these Amazon things. And then it was like, okay, these guys have actually figured it out, because they have a very sensible partitioning model.
As a startup, they were not in a position to say: okay, here's the $1.2 million check for our data center. Now, of course, that's no longer an issue, but who knows what that path would have been? And they were getting really good use out of the model, because they could just push a button. A great example is when they shipped the Android app: suddenly, a million people signed up that day. Either they would have had to make a huge upfront capital investment and hope they'd gotten it right, or they just kept pushing the more-stuff-please button until they reached capacity. They did a great job. So that's the people where it's like: yeah, okay, they thought it through. Another client I can't name, but I can give you the details. They're basically a legal research company. That's what they do. People come to them and say: we have this really complicated question; basically, are we going to get sued for doing this?
And they have to dig through it: will we get sued anywhere in the world for doing this? They have this enormous database, and they have analysts who sit there and run ad hoc queries on it. It's on Amazon. And these are the people who are paying $22,000 a month just for IO, because they have this team of legal analysts who are typing raw SQL into pgAdmin3, hitting enter, and getting results back. And it makes no sense. We've made our recommendations, and it's their call; we're technical consultants, not business consultants, so it's up to them.
So those are the two opposite poles. And, you know, they're a law firm; they're made of money. They could buy the server if they wanted to. But they made this decision, and I believe this is a case of Amazon Stockholm syndrome: they just feel locked into it. They've bonded with it.
The main takeaway I want you to get from this presentation is that it really is just a set of tools, with this relatively long column of virtues and this relatively long column of faults. If your particular application really plays to the virtues and can kind of cruise over the faults, you're in great shape. That's a wide class of problems, but it's not every problem. So, yes?
Oh, I'm sure. I know people who, well, there's at least one company that puts $100,000 a month of Amazon charges on its Amex. He has lots of miles, but he doesn't pay for airline tickets very often. And in that particular case, again somebody I can't name, I apologize, it kind of makes sense for what they're doing, because they need that kind of big throttle in the control room;
they would probably be spending a lot more than that if they had a dedicated rack of servers running. But, yeah.
As somebody at a startup said to me: Amazon is great for getting money; it's not great for making money. If you haven't gotten your first VC round, Amazon makes a ton of sense, because there's zero capital expense. However, once it's time to really start cranking, it may not make a ton of sense. So, any other questions? Yes, sir?
Right. You know, Amazon is probably the most extreme example of taking this model to its logical conclusion. With other cloud providers, other hosting providers, it tends to be a little more hoo-ha to spin up an instance, but you tend to get more out of the instance. The instance doesn't have as much performance variability, the hardware tends to be more dedicated to you, or you can arrange that if you need it, and you tend to have better guarantees about how over-committed that particular hardware is.
Their billing model tends to come in larger lumps than Amazon's: it tends not to be hourly, it tends to be monthly or something like that. So far, no one's quite gotten the API situation as nailed down, in my experience, as Amazon. Of course, this could have changed while we were talking here; this is a very fast-moving part of the world. Obviously, Amazon's the one everyone wants to be, and I'm sure that next week there'll be some new interesting offering that addresses some of these points. This talk is specifically about Amazon because they cast a very long shadow over the whole cloud hosting environment.
Yes, sir? Well, they don't have a built-in technology that does it, but they do offer a suite of technologies that, taken together, support it very nicely.
If you go to the WAL-E set of tools, for example, that's a very nice set of tools for setting up replication and that kind of thing. They do support backups of your individual machines to S3, their bulk storage, but there's nothing specifically Postgres-y about those; you're responsible for making that work with Postgres.
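To give a flavor of it, WAL-E hooks into Postgres's normal WAL archiving; the paths and version here are invented, so see its documentation for the real setup:

    # postgresql.conf: ship each finished WAL segment to S3.
    #   wal_level = archive
    #   archive_mode = on
    #   archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p'

    # Cron job or similar: push a fresh base backup to S3.
    envdir /etc/wal-e.d/env wal-e backup-push /var/lib/postgresql/9.2/main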
Yes, sir? Well, the problem is that's a little hard to do with MD; MD is kind of the weak link there. MD isn't wild about you dropping a unit out of a stripe like that, so you'd probably have to do a pure RAID 10, which is kind of expensive.
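A pure RAID 10 across four EBS volumes with mdadm looks something like this sketch; the device names are hypothetical:

    # Stripe over mirrors across four EBS volumes.
    mdadm --create /dev/md0 --level=10 --raid-devices=4 \
        /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
    mkfs.xfs /dev/md0
    mount -o noatime /dev/md0 /srv/pgdata

With mirrors underneath, you can fail a slow volume out and resync a replacement, which is exactly what a bare stripe won't let you do.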
There are people who do exactly this, by the way. There's a lot of discussion of: well, I get 20 instances, I test them, and I use the best five. It's sort of like overclockers: I buy 20 chips, run them all hot, and see which ones burn out. The problem is that's only true at the moment you do it; some hot startup may just move in. There are no protective covenants in your neighborhood. People can move in and put up aluminum siding if they want. The next-door neighbor can suddenly throw a loud party, to overstretch the metaphor.
We're all engineers; we want this illusion of control. You have to let go of that. Amazon is the one who figures this stuff out; you don't get to pick. There are times we've advised clients: look, just fire up a new instance in a different availability zone, because people are knocking the windows out and painting graffiti on yours. Thank you very much.