Shootout at the PAAS Corral

Video in TIB AV-Portal: Shootout at the PAAS Corral

Formal Metadata

Shootout at the PAAS Corral
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Production Place
Ottawa, Canada

Content Metadata

Subject Area
Shootout at the PAAS Corral: head-to-head for PostgreSQL cloud platforms

Where should you run your PostgreSQL in the cloud? Join us for a comparison of features, pricing and performance between various cloud options, including most or all of EC2, Amazon RDS, Heroku, OpenShift, Google Compute, and the Rackspace Cloud. To determine which cloud is the fastest, cheapest and best, over the next few months Josh Berkus and others will be running a series of performance benchmarks against several of the many cloud hosting options available for PostgreSQL. The results will be presented to you in this talk, including benchmarking methodology, cost comparison for each configuration, feature differences, and performance scores.
Welcome to Shootout at the PaaS Corral. I'm going to talk here about PostgreSQL performance on public clouds: running on different public clouds and comparing them. If you thought this was going to be something else, you still have time to get to another talk. One of the things that has been going on in the world of hosting is that it's a bit Wild West right now. If you hadn't noticed, there are dozens of different providers providing very different things. Some of the stuff is sometimes expensive and sometimes not, service providers can go away overnight, and they engage in vicious competitive sales wars, trying to undercut each other. So it's very Wild West. As a result, in my consulting practice, clients come to us and ask us to make recommendations on how they should be hosted on the cloud, or whether they should be hosted on the cloud at all. I'm not really going to address the second question here, only the first: if you are going to be in the cloud, which cloud, and how should it be configured? That sent me down the road of doing some benchmarking of PostgreSQL running on various public clouds, and that's what this talk is about.
A couple of things helped kick this off. One was meeting Ruben Rubio Rey of Manageacloud.com. They supply a unified API for deploying cloud servers on six or seven different public clouds, so they did a bunch of the benchmarking here, and I didn't have to write against all of the different cloud APIs because they had already done that.
I'd also like to thank Heroku and AWS, who have given me some free compute time so far. I may have some free time coming later from other providers as well.
Here are our Magnificent Seven, in terms of what I've tested so far: Rackspace, RDS, DigitalOcean, EC2, Google Compute Engine, Heroku, and then one mystery contender that I will be going over at the end of the talk.
Let's start with the area where I'm most familiar, which is Amazon Web Services. Most of our client deployments are on some variant of Amazon Web Services. It is such a large ecosystem that it includes all the various providers who have built on top of Amazon's infrastructure. The three options we're covering here are the gunslinger, which is the roll-your-own option on EC2; the rancher, which is RDS; and a third option, which is Heroku.
There is another option for PostgreSQL on EC2, which is that EnterpriseDB has their own cloud management layer on top of AWS. I have not tested it yet, since I haven't had free time on it; that's all I can say.
Because these are all built on AWS, they have some things in common. One is a comprehensive API. It is not necessarily easy to learn or easy to use, but it does cover absolutely everything: anything you can do on AWS or its subsidiary services can be done through a web services API, and in some cases certain things can only be done through the web services API and not through the GUI. That's not necessarily true of all the other clouds. With some of them you have to do certain things through their GUI tools, which is a little irritating when you manage a lot of servers. The other thing is that you get the largest available global distribution of all but one of the rest of the clouds, with zones in different regions. This is true regardless of which of the AWS-based platforms you use.
Another thing you get is lots and lots of extra stuff that you can use together with your AWS-hosted PostgreSQL instances, if it's useful to you: things like S3 long-term storage, Elastic Beanstalk, the container service, caching, and a bunch of other things that they have packaged as rental services for you to use. These can be used very easily with anything on AWS, whether it's EC2, Heroku, or whatever.
In some ways, the amazing thing about EC2 is that this was the only option a couple of years ago or more. Before Heroku launched, this was really the only option if you were hosting PostgreSQL on a public cloud: building your own PostgreSQL on EC2. It is still a major option for a lot of people.
So basically the idea of the gunslinger is this: you create an instance, you install PostgreSQL on it, you configure PostgreSQL, and you're on your own. That's pretty much it.
This is our little comparison sheet between the various options, sort of a character sheet for the gunslinger. This is platform as a service, as opposed to the database as a service that we're going to get to later on. Administration: you do it yourself. Archiving and availability: you do it yourself. PostgreSQL versions: whatever you want to install, even a development version if you want to go for it. Extensions: anything you want to install. Extra features: really nothing other than what the AWS framework provides. And the price is relatively cheap compared to a lot of other options.
Now let's talk a little bit about AWS. We have a lot of experience with AWS because we run so much stuff on it. One thing Amazon does is give you a heck of a lot of different options in terms of instance sizes, and it can be a little hard to navigate those. They also change all the time: I had to look them up for this slide this morning, because the instances available had changed in the last couple of weeks. To narrow it down: for smaller databases we often deploy on the M3 general-purpose instances. M is the general-purpose class, with a roughly equal balance of CPU, RAM, I/O and network. In theory, if you had a database that was extremely CPU-intensive but not particularly large, it would make sense to put it on one of the C-series compute-optimized instances. In practice I've never done that, because I've never come across that particular workload. What we end up using most of the time is the R series, which maximizes the amount of memory you have available, because caching, preferably of your entire database in RAM (I'll talk about that in a minute), is really important when running on a public cloud, where I/O is so very much slower. They also now have a couple of different storage-optimized classes, for fast I/O-optimized storage or for really massive magnetic volumes, aimed at people who have really large databases and data warehouses, although by and large public clouds are not necessarily the best choice for data warehousing applications. The important thing is that all storage on public clouds is shared network storage of some kind.
There are different options on EBS, and I'll talk about them, but it is all effectively shared storage. As a result, if you are used to your own hardware, with SSDs or hard drives on a RAID card inside the machine, the storage latencies you're looking at are going to be an order of magnitude larger for individual database writes. So the difference between the data you're looking for being in RAM and the data you're looking for being on disk is much larger than it is if you're running on your own hardware, and caching becomes more important than it is there.
A couple of things about storage on Amazon. The classic recommendation for databases is what they call Provisioned IOPS. This is where AWS guarantees that you will get a certain number of writes or reads per second, and it is available in whatever volume size you want to define, up to 16 terabytes I think. There are a bunch of other features of Amazon's Elastic Block Store. One that is particularly useful for database servers is the ability to create a coherent snapshot of the volume, which can be used as part of a backup and disaster recovery strategy. What we do increasingly these days, frankly for cost reasons, is use the general-purpose storage, which used to be useful only for low-performance, very bursty loads. That is now known as gp2, and they have been guaranteeing a certain number of IOPS based on the size of the gp2 volume. So suddenly it makes sense to allocate a 2-terabyte gp2 volume and get 6,000 IOPS out of that instead of paying for Provisioned IOPS, and that's what we're doing on new instances that we deploy, because of the cost. Amazon also has what's called instance storage, which is local storage very close to the instance. It has much lower latency than EBS, but it's also risky: you'd better have a more sophisticated DR plan, involving multiple replicas, backup to S3, continuous backup, et cetera, because the instance storage can go away. It can go away not only when the instance goes away; it can go away if there's a restart for maintenance reasons, and a bunch of other things can happen to instance storage. It's not in any way guaranteed. But if latency is a major issue for you and you're willing to deal with the extra redundancy, it can be an option. I did not benchmark instance storage.
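The gp2 sizing argument can be sketched as a small calculation. The ratio of 3 IOPS per gigabyte is inferred from the talk's own figure (a 2-terabyte volume yielding 6,000 IOPS); the floor and cap are illustrative assumptions rather than quoted AWS limits, and the function names are hypothetical.

```python
import math

# Ratio inferred from the talk: 2 TB (2,000 GB) of gp2 -> 6,000 IOPS.
GP2_IOPS_PER_GB = 3
# Assumed floor and cap, for illustration only; check current AWS documentation.
ASSUMED_MIN_IOPS = 100
ASSUMED_MAX_IOPS = 10_000


def gp2_baseline_iops(size_gb: int) -> int:
    """Estimate the baseline IOPS of a gp2 volume from its size in GB."""
    raw = size_gb * GP2_IOPS_PER_GB
    return max(ASSUMED_MIN_IOPS, min(raw, ASSUMED_MAX_IOPS))


def gp2_size_for_iops(target_iops: int) -> int:
    """Smallest volume size in GB whose size-based IOPS meets target_iops."""
    if target_iops > ASSUMED_MAX_IOPS:
        raise ValueError("target exceeds the assumed per-volume cap")
    return math.ceil(target_iops / GP2_IOPS_PER_GB)


if __name__ == "__main__":
    print(gp2_baseline_iops(2000))  # 6000, matching the figure in the talk
    print(gp2_size_for_iops(4000))  # 1334
```

Under these assumptions, hitting an IOPS target with gp2 is a matter of over-allocating disk space, which is the trade the speaker describes making for cost reasons.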
One thing to understand is that all the storage ratings on AWS, and on the other public clouds in general, are measured in terms of IOPS. It's important to understand that IOPS and throughput are not the same thing. IOPS is how many operations you can do per second, up to a fairly limited size of operation, depending on the storage.
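To make the IOPS-versus-throughput distinction concrete, here is a minimal sketch. The operation sizes are hypothetical: 8 kB is PostgreSQL's default page size, and the 16 kB cap on a single operation is an assumed provider limit, not a figure from the talk.

```python
def throughput_mb_per_s(iops: int, op_size_kb: float) -> float:
    """Upper bound on throughput if every operation moves op_size_kb."""
    return iops * op_size_kb / 1024.0


def ops_needed(request_kb: float, max_op_kb: float) -> int:
    """I/O operations one request consumes when each is capped at max_op_kb."""
    full, rest = divmod(request_kb, max_op_kb)
    return int(full) + (1 if rest else 0)


if __name__ == "__main__":
    # 1,000 IOPS of single 8 kB page reads is under 8 MB/s of throughput.
    print(throughput_mb_per_s(1000, 8))  # 7.8125
    # A 64 kB sequential read counts as four operations under a 16 kB cap.
    print(ops_needed(64, 16))            # 4
```

The point of the sketch is that a high IOPS rating says little about bulk transfer speed when each operation is small.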
A couple of things follow from that. First of all, the IOPS figure is not just a guarantee, it's also a limit. I don't know what the AWS engineers do, but I've been fairly impressed at how they can stay within 10% of the target IOPS, plus or minus, on a really consistent basis. So don't think that if you're paying for 5,000 IOPS that means 5,000 or more. No, it means you're within about 10% of 5,000 all the time. The other thing is that if you're doing operations that involve a heavy degree of random access, each access is going to be one I/O operation. The classic example is an index lookup with a nested loop join in PostgreSQL, where you're querying the index repeatedly. In that case every single row becomes its own I/O operation, and then your IOPS rate is the number of rows per second you can read. A thousand IOPS sounds like a lot, but a thousand rows per second is not a lot; that's pretty slow. So keep that in mind when allocating these.
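The nested-loop arithmetic can be sketched as follows. This assumes the worst case described in the talk, where every row fetched costs one I/O operation; the cache hit ratio parameter is an added illustrative knob, not something measured in the benchmarks.

```python
def lookup_seconds(rows: int, iops_limit: int,
                   cache_hit_ratio: float = 0.0) -> float:
    """Estimate the time to fetch `rows` by random access at a fixed IOPS limit.

    Each row that misses the cache is assumed to cost exactly one I/O operation.
    """
    if iops_limit <= 0:
        raise ValueError("iops_limit must be positive")
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    disk_reads = rows * (1.0 - cache_hit_ratio)
    return disk_reads / iops_limit


if __name__ == "__main__":
    # One million uncached rows at 1,000 IOPS: about 1,000 seconds.
    print(lookup_seconds(1_000_000, 1000))
    # The same lookup with 99% of rows already in RAM: about 10 seconds.
    print(round(lookup_seconds(1_000_000, 1000, cache_hit_ratio=0.99), 1))
```

This is also why the speaker favors memory-heavy instances: raising the share of reads served from RAM shrinks the number of operations that count against the IOPS limit.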
Other things you want to set up: public cloud instances are ephemeral. They can go away, they can be redefined, they have all kinds of problems. Redundancy is all-important. If you're on AWS you want to do WAL-E to S3; on a different public cloud you would use another redundancy option. Replication to a second server is really not optional. Every instance that we set up on a public cloud has both continuous backup and replication to at least one other instance, because you need to be able both to fail over fast and to recover from more complicated disasters. That's a good idea in general, but on your own hardware you can kind of put off one of those two things and say, OK, we've got backup going, so I'm not going to do replication right now, or we've got replication, so we'll set up backup next quarter. On a public cloud you should be doing both from five minutes after you launch the instance. Monitoring for the instance should seem obvious, but apparently it's not. And you are sharing the network with lots and lots of other people, so use SSL for everything, and lock down your pg_hba.conf so that only hosts within your Amazon VPC can connect to your servers. Be very security conscious, because you are sharing hardware with everybody else.
You do not want to be the next news item. So that is the basic setup for a lot of this.
I started out with an EC2-based sizing, because I was trying to provide a group of sizes that I could afford to run benchmarks on and that I could match up between the different clouds. This is the small one, the cheap one-off database: you're just getting started, it's a new development project, whatever, and there's not much load on it. On Amazon this is the m3.medium: one core and about 4 gigabytes of RAM. We did 40 gigabytes of storage with 1,000 Provisioned IOPS. I used Provisioned IOPS because I was still comparing gp2 against Provisioned IOPS in real performance, so I wanted to go with what I already trusted.
The larger instance that we tested is the r3.2xlarge, which is 8 cores and about 60 gigabytes of RAM, with storage at 4,000 Provisioned IOPS. That's really more of a medium in terms of what's available, but it was the larger size I tested, and everything else is kind of comparable to that.
Now, the problem with pricing. I have previously given other versions of this talk that concentrated more on cost-performance, because that was what our clients wanted to know when I started doing this. The problem is that since I first put together these benchmarks in February, costs for several public clouds have changed multiple times. So I've taken most of the cost material out, understanding that even the prices I present to you now are probably already wrong. But here is roughly how the computation goes for the gunslinger, in terms of the way that we set it up. We've got the instance at $30 to $50 per month. We've got Provisioned IOPS, another hundred dollars a month. We're archiving to an S3 archive, which is cheap because we're not using very much storage. And then of course we have a replica that has the same setup. So that comes to about $200 a month, plus miscellaneous charges.
change all the time I mean for example on the if I do another round of this I'm going to be
using GP to which decreases are block storage cost on an accident that now if you were really trying to do this on the cheap you really could do this analogy because you GP to you could decide not to have a replica rely entirely in continuous back up on now actually bring it down to about 75 bucks a month plus miscellaneous text I was the set
of gunslinger this is the the way I had configured at the time and again if you actually reconfigured this with some the current pricing and GP to the dual cheaper about it's also offers an option I believe some of the other providers do as well that you can actually pay for a large black and the instance like prepay for a year of usage on and then the price per day you know the price per month gets a lot cheaper and if you do that and these are these are the on-demand prices that I'm representing here on and that's true for all provide some writers don't have non demand prices of you I said that was our
EC2 configuration for running the benchmarks. So now our next character is the rancher: RDS, the Relational Database Service.
The main reason to use this service is: I like AWS, but I don't want all of this instance management, this is too complicated, that sort of thing. What I actually want is to just use it, because that's what I have a deployment for.
So this is what's known as database as a service. You're not dealing with the operating system, you're not dealing with system configuration; you're just dealing with the cloud API plus port 5432, and that is your whole interface to the database. And
I also call it "somebody else's DBA". They take care of uptime, backups, configuration, and PostgreSQL updates. It's a really good option if you don't actually have any full-time ops people in the company and, for whatever reason, you don't want to hire a company to manage it for you. There are some downsides to this. You don't get all of your choices of everything in the Postgres universe you might ever want: only certain versions are available, only certain Postgres extensions are available, and you can't do weird, offbeat configuration things that on a regular server would require command-line access. The kinds of security that you can use are limited, because on the database service hosts you are not allowed to touch pg_hba.conf. And it costs a bit more, because you are paying someone else to be your DBA, even if only on a percentage basis distributed across a lot of instances.
So here is, again, a little character sheet of the things you care about. The type is database as a service. Administration is mostly automatic; there are some things that you still want to do yourself. postgresql.conf is available, and you often want to tweak a couple of things in there, so depending on your setup it might or might not be fully automatic. For HA, RDS offers multi-availability-zone redundancy, and that is automated and part of the platform. It has some performance drawbacks, so even when using RDS, some of our clients actually set up regular replication and use that for redundancy. The versions available are currently 9.3 and 9.4, and there are about two dozen extensions available. There aren't any particular extra features for PostgreSQL specifically, and the price is moderate.
So, about that Multi-AZ: this is basically Amazon's own internal synchronous replication, block-based or file-based or something like that. It's not Postgres replication; it's their own storage replication to another node, with automated failover and uptime guarantees. In terms of availability it is, of course, great. There are some major performance costs associated with Multi-AZ.
And now the next one, and that's Heroku. The reason to use Heroku is: I just want to develop, and the cloud should handle 100% of administration. It's a great option if you are a solo developer, or if you're part of a shop that has a lot of developers and absolutely no ops staff.
And so this is, again, database as a service. Administration is fully automatic; in fact, if you wanted to get involved in administration, you really couldn't. Their high availability is replication plus point-in-time recovery, which they manage, with guaranteed restore and that sort of thing through various options. On versions, they tend to keep up with Postgres more quickly than the other cloud providers. Available, again, are around two dozen extensions, and there are several extra features, which I'll mention. Pricing is relatively high, and this is because they're providing the most extra stuff and the most complete "you never administer or even think about administering the database" option.
And what do I mean by extra stuff? Well, here it is. First of all, if you've got a big Git-based development shop, that's what their workflow is designed around: a combination of git and rake, which you use to manage things and deploy instances, rather than other forms of API. So if you like that workflow, it's great; if you don't like that workflow, it's not. The other nice thing that they have, which I haven't seen from anyone else, is this thing called Dataclips. These are basically HTTP-accessible materialized views that you can set up through the API and then make available to your customers or your other users or whatever. And they've really simplified replication to be comprehensible to everybody on your team, by having this concept they call followers and making it sort of point-and-click, or a single API command, to set up the followers.
And the other big feature is this. I was benchmarking Heroku, and I set up a large instance to benchmark on, and in less than 24 hours I got this email. I checked, and the person who emailed me had no contact with the people who knew this was a benchmark. So this is something that they do normally, which is that the administration goes beyond just the automated administration, to the point where if you are a larger customer they have help and advice available.
Now, they actually have a much more limited set of options: there are 5 database sizes and 3 levels of high availability, and that's it. You know, a matrix of 15 options; those are your choices.
So the sizes we used were a small one, which is pretty much equivalent to the m3.medium although it's not actually an m3.medium, and the large one, Standard 6, which is 60 gigabytes of RAM and 8 cores, like the other one. And now
on to the other clouds, getting off of AWS and moving into the wider world of platform and database as a service, mostly platforms. So, Rackspace: our businessman.
The main reason to use Rackspace Cloud, as far as I can tell, is: I have a lot of servers at Rackspace, and now I want to branch out into the cloud, have a public cloud and that sort of thing.
It's platform as a service again. The administration is do-it-yourself plus Rackspace's support. As far as I can tell, Rackspace has no specific support for Postgres, though, so if you have a problem with Linux, support is really good; if you have a problem with Postgres, you're on your own. High availability is do-it-yourself. Because you're installing everything yourself, everything is available. The main extra is that Rackspace has regular rental servers and cloud servers available on the same network in the same data center, so you can do what they call hybrid cloud, where you have some things that are cloud-deployed servers and some things that are regular servers, mixed even in the same application. Pricing is sort of moderate. By the way, that support makes for interesting pricing in the cloud, because signing up for their extra support is actually not optional, meaning your first cloud instance costs a lot, because you have to sign up for a support program, but then it's not incremental with additional instances.
And so, one of the things here is that Rackspace's block storage option is weird. Primarily their machines use instance storage, and that's what we used for all of the benchmarks. Block storage is only available with the larger cloud instances; it's not available at all on the smaller ones. At least, in early March when I ran these that was true; I haven't checked to see whether that has changed. On top of which, weirdly, there's not a unified API for the block storage. So we ended up doing all of the tests on instance storage, because we couldn't make it work otherwise.
And so this is our sizing. Now, one of the places where we had to deviate is the small instance: Rackspace didn't offer anything with 1 core, so you will see higher performance on the small instance in the performance graphs, because we have 4 cores available, which is more than on any other cloud at that size.
And then there's Google, which I call the drifter, not because Google is a drifter, but because they seem to be aiming at poaching everybody else's customers.
This basically says: I want an AWS-like platform, only much cheaper. You know, I want the global distribution, I want a lot of the options that AWS has, and I want it for less money.
This is platform as a service; everything is do-it-yourself, install-it-yourself. The only extra, really, is that Google is the first platform to offer integration with Docker and Kubernetes containers, for any of you who are doing that kind of thing. Everything is available as far as Postgres goes, and the price is very cheap. This is the instance size that we chose, fairly analogous to the AWS sizes that were available. So now here's
the kid: DigitalOcean. I call them the kid because they seem to be aimed mostly at independent developers in terms of their marketing. The pitch here is cheap, simple, and fast, and nothing else.
And so it's platform as a service, do-it-yourself everything, like that. No extras. The only upside is that this is by far the cheapest option in public clouds in terms of available instance size for the money. There is a problem with the kid, and it comes up a lot.
DigitalOcean does not have any kind of network block store option. It's all instance storage, nothing is durable, and there is no long-term redundant storage for doing backups. So, for example, we've got a client on DigitalOcean, but we're still backing up to Amazon S3, because DigitalOcean doesn't have an option for that. No extra features of any kind. And so this is the size we picked; we were pretty good at getting close to analogous to the others. Besides these, there are more clouds out there.
There are things like OpenShift, Joyent, Azure, and Cloud Foundry. Given lots of free time and lots of funding, I would test all of these; in reality, we'll see how it works out, because they're all interesting. For example, Microsoft, I think, has added a specific sort of Postgres support option, and I want to test that but haven't been able to.
So let's shoot it out with the ones that we have; we've already got 6 of them, right? So now, what is the shootout?
It's pgbench. Now, pgbench has some advantages. It ships with Postgres: you install Postgres and it's in a package. It's a micro-benchmark with a very simple sort of banking workload, and it's really fast for setup and teardown relative to other benchmarks.
The drawback is that it's really limited in what portions of Postgres it tests and what portions of the platform it tests. It doesn't do complex queries. It's purely random data access, with an unrealistic balance of work that's really reliant on single-row write speed. And in terms of its balance it's not very tunable; I would say not tunable at all.
On one of the instances, I did some testing of an untuned instance, with the default postgresql.conf, versus what I regard as a tuned postgresql.conf, and you can see the "tremendous" difference in performance between the 2 options. The drawback of pgbench is that, because of its purely random single-row-access workload, it's really not responsive to Postgres tuning in most ways. I thought at least increasing checkpoint_segments would make a significant difference, but it did not; the difference for that instance was about half a percent.
So here are the sizes. I did 3 different sizes of pgbench: in-memory read-write, so 50% of main RAM with read-write transactions; in-memory read-only, 50% of RAM, read-only; and then on-disk read-write, where the database was between 150 and 300% of RAM in size, doing read-write transactions.
You can look at this later on the slides; this is just so that if anybody decides they want to recreate my results, these are the pgbench commands I was running on the instances. I'll cut this short so that we can actually get to the results.
Most people know the transactions per second from pgbench, which is your overall throughput for the run. This measures multiple things depending on which configuration you're running: write speed, et cetera. An additional thing I decided to measure, partly because of who originally requested some of this comparison, was the initial load time in pgbench, where you see how fast the instance is at doing a badly engineered load, because what it's doing is a lot of single-row writes. But frankly, a lot of customers are doing that; that's how they are doing their ETL, with a lot of single-row writes, so it's nice to know what the speed is. And it was interesting, because there actually is a fair amount of variability in how long that initial database build takes. It also includes index-building time, which gives you sort of large-memory operations and CPU.
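The three database sizes described above (50% of RAM for the two in-memory runs, 150 to 300% of RAM for the on-disk run) can be turned into pgbench scale factors with a little arithmetic. The following is a minimal sketch, not the commands from the slides: the figure of roughly 15 MB per scale unit is a rule-of-thumb assumption, the host and database names are hypothetical, and the client count and duration are placeholders.

```python
import math

# Assumption: one pgbench scale unit (100,000 account rows) takes roughly
# 15 MB on disk. This is a rule of thumb, not a figure from the talk.
MB_PER_SCALE_UNIT = 15


def scale_for(ram_gb, fraction_of_ram):
    """Return a pgbench scale factor giving a database of about fraction_of_ram x RAM."""
    target_mb = ram_gb * 1024 * fraction_of_ram
    return max(1, math.ceil(target_mb / MB_PER_SCALE_UNIT))


def init_command(scale, host, dbname="bench"):
    """Build the initialization command; timing it gives the 'load time' score."""
    return ["pgbench", "-i", "-s", str(scale), "-h", host, dbname]


def run_command(host, clients, seconds, read_only=False, dbname="bench"):
    """Build a timed run; -S switches to the built-in select-only workload."""
    cmd = ["pgbench", "-h", host, "-c", str(clients), "-T", str(seconds)]
    if read_only:
        cmd.append("-S")
    cmd.append(dbname)
    return cmd


if __name__ == "__main__":
    ram_gb = 4  # roughly the small instance described earlier
    print(init_command(scale_for(ram_gb, 0.5), "db.example.internal"))
    print(run_command("db.example.internal", clients=4, seconds=600, read_only=True))
```

Because the benchmark client runs on a separate instance, the host argument always points at the database machine, so every run also exercises the network.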
Other conditions: it was always run on Postgres 9.3, either 9.3.5 or 9.3.6 depending on the run. Unfortunately, we were unable to use the same OS on all of them: there was at least one cloud where CentOS 7 didn't run, for reasons we didn't really understand. So this is either Ubuntu 14.04 or CentOS 7.1. These are all either Ubuntu 14.04 or CentOS 7 for the platform-as-a-service ones; for database as a service we don't know what OS they're running. For platform as a service, the Ubuntu 14.04 kernel would have been 3.13; I'd have to check what the CentOS one would have been, so, whatever CentOS 7 would have been in February or March. And also, these are two-instance tests: we're not running pgbench on the same instance Postgres is running on. It's running on another instance of the same size, so we are actually testing network I/O here as well; in a lot of cases, that's mainly what we test.
Yes, within the same zone, except that some database-as-a-service platforms don't tell us where we're running, and therefore we have to guess.
And then we run it many, many, many times, the entire thing. And why do we have to run it so many times? Let me explain that with a box plot. Everybody know what this kind of box plot is?
This is actually the range of scores, sorted, for one of the sizes of runs; I think this might be the RDS large run, not that it matters much. You can see there's a pretty substantial amount of variability between individual runs. Each run, by the way, created a new set of instances, so we're not doing the performance runs on the same instances; these are different instances each time. And you actually see that, comparing the minimum score and the maximum score, that's almost a factor of 10 difference. Now, some of that is the randomness of the pgbench workload, but a lot of it is the randomness of the actual instance capabilities you get. The machines are not always the same; the number of other people who are doing busy things on the same physical machine is definitely not the same; the number of other people on the network, and how busy they are, is not the same. All these things affect your real throughput. For that reason, these are the scores I'm going to present for benchmarking. For the load time, where larger is worse, I'm going to give you the median score and the 90th percentile; and for TPS, where you want more, I'm going to give you the median score and the 10th percentile. That's because generally, with database performance, what we're looking for, our target for response times, is that 90 percent return within X, and so that's what we're sort of shooting for.
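The scoring rule described above (median plus the 90th percentile for load time, median plus the 10th percentile for TPS) could be computed along these lines. This is a minimal sketch: the nearest-rank percentile method and the function names are assumptions, since the talk does not say how the percentiles were calculated.

```python
import statistics


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of runs at or below it."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, clamped to at least rank 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize_load_times(seconds):
    """Load time: lower is better, so report the median and the slow tail (90th percentile)."""
    return statistics.median(seconds), percentile(seconds, 90)


def summarize_tps(tps):
    """Throughput: higher is better, so report the median and the low tail (10th percentile)."""
    return statistics.median(tps), percentile(tps, 10)


if __name__ == "__main__":
    # Illustrative numbers only, not results from the talk.
    print(summarize_tps([410, 980, 1200, 1150, 300, 1020, 890, 1100, 950, 1010]))
```

In both cases the second number is the pessimistic one: it is the score that roughly 90 percent of runs meet or beat.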
So, when the smoke clears, it's time for a whole bunch of graphs. Now, some caveats here. First of all, there are comparability problems between clouds: we're not measuring the same things. With Rackspace and DigitalOcean it's instance storage, and only the database-as-a-service options have fully automated HA, so it's not really the same thing. Instances aren't identical either, and instance OSes are not identical, so there's a lot of variability in here outside of the test itself. Also, prices, instances, and the different things that are available have changed over the last 3 months and will continue to change. One of the things I already mentioned is that the second half of this talk, in previous versions, was cost comparisons, and I discovered that pricing for the public clouds is way too much in flux for me to include many of those, because it's generally out of date: even if I updated it right before the conference, it would be out of date by the time I presented it.
And so all of this is a work in progress, and something to think about in terms of how you measure.
Now, I only have one cost slide, just to show you one thing that's actually going on that has not quite changed since I was presenting a whole set of cost slides. What I want people to notice here is that Google and DigitalOcean are substantially different in pricing profile from all the other options that we have in hand. I believe, and people I talk to in the cloud business believe, that this is because they are currently providing services below cost as part of a major expansion. Which means that if you are cost-sensitive, they're a great option right now, but might not always be. Now, let's actually look
at some things. Here's the small in-memory instance, and we're going to look at load time here: how long did it take to load that initial small in-memory database? Obviously, shorter is better, right? So we notice a couple of things. One is that DigitalOcean and Rackspace are faster, and that's because they're using instance storage, and we're doing individual inserts, so latency matters for the loading. So you're getting faster performance there at the cost of reliability. Now, the other thing you notice here is that RDS is significantly slower for load. And so the first thing I did when I saw this was email Grant, like, what the hell is going on here? There are a couple of things going on. One is that with RDS, if you're doing automated backups, which I strongly recommend, that means they turn on archiving from the get-go, and the archiving is done using the same IOPS allocation that is available for the database in general. Whereas when I set up my own EC2 instance, I am using a different channel to do archiving, so it doesn't come out of my IOPS allocation. So it's not exactly comparable. The other thing Grant pointed out is that they had checksums turned on, and for that reason there was some overhead. So I was like, OK, what's the real difference between checksums on and off? And I said, OK, let's actually get a relatively stable instance, which you do by going on a fishing expedition, where I keep starting and stopping instances until I find one that performs above what I would expect for that particular profile, and does it consistently. That generally means that I've managed to grab the first instance on a machine, and then I can actually do a set of stable tests.
So I did this on one machine, and it turns out, and this is with a large database, that there actually was, depending on the size of the database, a 12 to 25 percent difference in load time depending on whether you had checksums turned on. So that is part of that difference, which matters, because I would recommend turning on checksums in general. Now, when I see that, I'm like, hey, if it makes that much of a difference for load, what's the difference like for throughput in pgbench?
And it turns out the effect on pgbench throughput was much smaller. These are the 3 different benchmarks that I did. For throughput in general, for pgbench, we're only talking about a 2 to 3 percent impact. So what this says to me is: hey, checksums prevent a certain class of database corruption, which is really important when you're on network storage, and we're only talking about a 2 to 3 percent impact, unless your workload mainly consists of ETL. So just turn them on. This does mean re-initializing your database, or only doing it for new ones. So, what's
the load time like? So here's our load time for small, and this is the on-disk, larger-than-memory one. It's not really different from what we were seeing before.
Large in-memory load time: all of a sudden Google Compute Engine sort of ramped up here, and what we discovered was that this was because we didn't really understand how GCE allocated IOPS. So I would like to go in and redo these tests, because now we do understand it. Basically, we were getting instances that were capped at 500 IOPS, and that really affects the other tests that we did.
I was saying this slide was supposed to go.
So let's skip over it, because the cost comparison is no longer valid. So now let's look at some performance benchmarks: transactions per second, small, in-memory read-write. Here, larger is better. So we've got some differences here. Again,
for DigitalOcean and Rackspace, that instance storage is an advantage in the comparison. So again, performance at the cost of reliability. Now, for the rest of it, these are pretty similar: RDS, EC2, Heroku. A couple of the differences here: Heroku is actually, in the small instance, really outperforming the other small instances. Going back and forth and asking Heroku some questions, what I figured out about this is that these small instances are actually larger Amazon instances that they have subdivided. So there is a tremendous opportunity, which I would have to repeatedly make use of, to get extra performance by being a bad neighbor, by using a lot more resources on the machine than I was necessarily entitled to. So if that's your strategy, then actually, you know, Heroku is the performance option. The other thing was, in my tests, and Grant argues with me about this, but in my tests, having multi-availability-zone turned on for the read-write tests resulted in something like a 30 to 40% throughput drop. That's basically because Multi-AZ redundancy is synchronous replication, and synchronous replication always adds a substantial amount of network latency to writes, and for that reason affects your throughput.
Now, doing the small in-memory read-only, we actually see some different things. Rackspace is higher on this one because we've got more cores; same with DigitalOcean, one extra core. That has a big effect on small read-only. The others are similar. We get some extra variability on Heroku, again because of the bad-neighbor problem. And this was actually a weird thing that Grant has not so far been able to explain, and I did extra test runs because of it: for some bizarre reason I was getting slightly higher throughput in the read-only test with multi-availability-zone turned off, and I never figured out why this was. It's only slightly higher, but it showed up over the course of 11 or 12 tests, so it's not, you know, statistically insignificant.
All right, I'm running low on time, so let's actually go quickly through some of the rest of these. So, small on-disk read-write here, and here you're seeing, again, that instance storage kind of dominates versus block storage; block storage tends to equal out performance overall. And that's actually it for the whole small set.
Now, large in-memory. The large-instance performance is actually fairly different from the small-instance performance, because the different clouds treat different classes of machines very differently, and the small ones were actually more consistent than the large ones. So, the in-memory read-write: because I know how it's designed, I was actually a little surprised by this one. By the way, I've been working with the cloud providers on this. I've been emailing Grant, and they've been, you know, changing settings for the benchmark. I did the same with the Heroku guys, because the initial benchmark was way below this, and they actually, based on my results, made some changes to the configuration of the Heroku instances, not just for me but for everybody, and we retested and performance got much better after this. So here we're seeing Heroku and RDS doing better, partly, I think, because of tuning by their respective teams, the Heroku team and the RDS team. With GCE we were hitting our 500 IOPS cap. With Rackspace, this is the first time that Rackspace also seems to have capped IOPS, which it did not on the small ones, so I really don't know what's going on with Rackspace storage. But notice that the 90% and the median are almost the same; the whole test run was almost flat, which meant I was really being limited by storage. I don't really understand how Rackspace cloud storage works. And now
this is the read-only in-memory, which is a very different profile. You'll notice this is dominated partly by, again, Rackspace, where we have some extra cores, because they favor CPU over other resources. And I got really a lot more variability on the RDS tests on this one than I did on the other types of instances. I don't know if that was just the luck of the draw; if I go back for another round of testing, I want to see if that shows up. Otherwise they're fairly similar, because here we're largely hitting Postgres performance limits and also network performance limits, and the networks are not that different between the different instances compared to other kinds of resources.
Large on-disk: and again, here you can see where being IOPS-limited hurts performance. Because of the 500 IOPS limit, we couldn't actually make the test complete on GCE. And on RDS, here, this is the sort of extreme version of the difference between Multi-AZ and regular.
So, lastly, I have a mystery guest, if we have any time before the questions. So here is the mystery guest: who was the 7th guy? Well, this is the gambler, and this is what I call running-with-scissors mode. It came out of a discussion we got into online about how fast Postgres could run in memory if you didn't care about redundancy or reliability.
So I went and did a whole bunch of settings to basically eliminate as much disk access as I possibly could from Postgres, right? So: no bgwriter, as little WAL as possible, turn off fsync, synchronous commit, and full-page writes, increase the WAL buffer size, and increase the shared buffer size, so that Postgres caches as much as possible and doesn't touch the disk, and the OS decides when to flush on its own.
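The running-with-scissors settings listed above map onto real postgresql.conf parameters roughly as follows. This is a minimal sketch that just renders a config fragment: the parameter names are standard PostgreSQL settings, but the specific values (and the use of bgwriter_lru_maxpages = 0 to quiet the background writer) are illustrative assumptions, not the values used in the talk. These settings trade away crash safety and should only be used on data you can afford to lose.

```python
# Illustrative values only; the talk does not give the exact numbers.
SCISSORS_SETTINGS = {
    "fsync": "off",                # do not force writes to disk
    "synchronous_commit": "off",   # do not wait for WAL flush at commit
    "full_page_writes": "off",     # skip full-page images in WAL
    "bgwriter_lru_maxpages": "0",  # assumption: disables background writing
    "wal_buffers": "16MB",         # assumption: larger WAL buffer
    "shared_buffers": "2GB",       # assumption: size to fit the working set
}


def render_conf(settings):
    """Render settings as postgresql.conf lines, one 'name = value' per line."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items())


if __name__ == "__main__":
    print("# WARNING: unsafe; data loss on crash is expected")
    print(render_conf(SCISSORS_SETTINGS))
```

For the purely ephemeral case, unlogged tables (CREATE UNLOGGED TABLE) are the other lever for avoiding WAL traffic.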
And so, understandably, you see some changes here. This is the small instance. There were a couple of things I tried in terms of configuration. One is imagining that you're doing a running-with-scissors database but it's a replica, and that's the reason why you're willing to run it like this. The second was: if it was a purely ephemeral service, we could also make the tables unlogged, so that they weren't writing to the WAL. One of the things here is that the unlogged version, at least on the small instance, is not significantly faster for load, which was a bit of a surprise to me. It turns out that if you're just writing the WAL to the file system cache, at least on a small instance, it doesn't really make that much of a difference. But you can see loading is way faster, and again, here shorter is better, as you would expect on an ephemeral instance.
Now, TPS did not actually improve on the small instances as much as I would expect. Part of it is that I did more than one test run for this, and it pretty much seemed to peak at around 600 and change. I think what's happening is I'm hitting a limit either on the number of transactions I can pump out of pgbench on a small instance or, more likely, on network I/O, because I was checking the Postgres instance, the database, and that was not out of resources.
Now, in-memory read-only: you wouldn't expect this to be a lot faster; you'd expect it to be a little bit faster, and it was a little bit faster.
And now the large one. I've also put the EC2 one in for comparison, so you can see the difference. I did scissors here too, and on the larger instances it was much faster, partly because I think I wasn't actually running out of resources on the pgbench machine the way that I was for the small one. You can see it's two to three hundred percent faster in running-with-scissors mode for read-write.
And for read-only it's a little bit faster, but not in those terms. So it's an option: if you have lots of replicas that you can afford to replace, and you've got lots of load coming in, then even if they're only processing read-only load, you will get a boost from running with scissors.
So, here are a few of the things that I would like to do. More tests: I'd like to add a JSON benchmark to test document-database workloads. I'd also like to add DVD Store, which is a somewhat more complicated transactional benchmark than pgbench, so that I wouldn't be testing just a couple of axes of performance but a lot more variety of stuff. And I'd like to test more of the other clouds.
And I'd really like somebody else to collaborate on the project and actually provide better visualizations for the results, you know, in terms of providing charts and that sort of thing. I want not just TPS but latency on individual transactions, and maybe to do some time-versus-latency graphs as well. I just don't have time to do all the programming.
So I'll take any questions until the next speaker kicks me out. Questions?
That was in early March. If I followed the block storage instructions in the documentation, it said block storage was only available at this range of sizes. I know Rackspace is actually cutting back on their commitment to cloud, I think, and focusing more on what they call hardware as a service, so it may be that they have fewer options now than they had last year.
Yes, I know, we haven't done that. It's rather complicated, because of all the axes of configuration that you can tweak, right? Like, what instance size is it, how is it configured, what kind of storage do you have, and how much of it do you have. And that ends up being a sort of complicated matrix for recommendations. It would also be nice to go beyond the pgbench-only benchmark before making those recommendations. One of the things I actually did discover, like I said, is that it's worth going fishing for a better instance. If you're going to be putting something into production, never do it on the first instance you get. Get an instance, run some sort of synthetic benchmark, and check that you haven't gotten an instance that's got too many neighbors, or is on a 6-year-old machine, or has some other problem, because the variability in performance between instances, even in the same cloud, is actually quite substantial. And if you put in the effort and you're good at this sort of fishing, you can sometimes get an instance that's better than the median in the pool, and that's a real bonus, because you're paying the same amount regardless of how good the actual instance is. Other questions? OK, thank you very much.
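The "go fishing for a better instance" advice can be sketched as a simple loop: provision, benchmark a few times, keep the instance only if it is consistently above expectations, otherwise release it and try again. This is a minimal sketch under stated assumptions: provision, benchmark, and release are hypothetical callables you would supply for your own cloud, and the threshold, trial count, and attempt limit are placeholders.

```python
def go_fishing(provision, benchmark, release, expected_tps, trials=3, max_attempts=10):
    """Keep provisioning instances until one consistently meets expected_tps.

    provision() returns an instance handle, benchmark(instance) returns a
    TPS score, and release(instance) shuts the instance down. All three are
    hypothetical hooks, not part of any real cloud SDK.
    """
    for _ in range(max_attempts):
        instance = provision()
        scores = [benchmark(instance) for _ in range(trials)]
        # "Consistently" here means every trial clears the bar.
        if min(scores) >= expected_tps:
            return instance, scores
        release(instance)
    return None, []


if __name__ == "__main__":
    # Fake cloud for illustration: the third instance is the good one.
    fake_scores = {"i-1": 300, "i-2": 450, "i-3": 700}
    ids = iter(fake_scores)
    keeper, runs = go_fishing(lambda: next(ids), fake_scores.get, print, expected_tps=600)
    print(keeper, runs)
```

The attempt limit matters in practice, since every discarded instance still costs at least its minimum billing increment.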