How Postgres Got Its Groove Back

Speech transcript
Postgres is the next big thing, and as proof of this, just before I got here I noticed that my tweet had been favorited and that a hipster had just retweeted it. Postgres is a lot cooler now, so let's just open up the Pabst Blue Ribbon of databases. This talk is How Postgres Got Its Groove Back. I am
Peter van Hardenberg. I work on Heroku Postgres, bringing Postgres to the internet, and I have been doing this for a few years. I work at Heroku, a cloud application delivery platform. Our mission is to help people deliver better software faster, and that's why we offer PostgreSQL: Postgres is the best database there is, and people love that. We're very happy running PostgreSQL, around 500 thousand or something; it gets hard to count.

Anyway, a little about what I'm going to talk about. First I'll go through the last 15 years of Postgres development in about 15 minutes, boiled down to the utmost extent. Then I'm going to talk about the world of marketing for open source. We'll talk about pain points and existential threats, what I think those mean and what the differences are. Then I'm going to talk about some ways that you can help Postgres get its groove back and continue to grow and be successful in the future. So this is my very own
personal selective history of Postgres, mined from the release notes all the way back to the beginning of
the 7 series. Hopefully it'll be a nice walk down memory lane for some of you, and for those of you who are relatively new to the community I hope it will help to explain how Postgres got to where it is. So let's begin with
7.0. This is 2000. The 7.0 release was the first time that foreign keys were implemented, and it saw the introduction of the SQL join syntax. For each release I've picked out the larger, high-level features. I've tended not to include performance, because every release includes performance work to varying degrees; I'm just calling out the kinds of things that a user might notice as new capabilities in each one. These are very early days for Postgres, around 5 years since the shift to PostgreSQL, when a group of people replaced the original query language with SQL, and it's really still immature.

The following year introduces TOAST, which is, as they say, the best thing since sliced bread. TOAST is actually an innovative feature which hopefully none of you have ever had to think about, because what it does is invisibly and transparently move large data out of the row and store it elsewhere in a compressed form. This is really awesome because it allows Postgres to ingest larger datasets than it was previously capable of. Just to give a sense of how young the database is at this point, outer joins are not implemented until this release, the year after 7.0. It also sees the introduction of the write-ahead log, which is such a foundational technology that it kind of blew me away that it hadn't been there since day one.

In
2002, PostgreSQL gets a usable vacuum. Before 2002, vacuuming a table meant taking a full table lock on it while rewriting it, which made it not very operationally convenient. The transaction wall was also removed. Previously every SELECT statement created a transaction, and once you hit 4 billion transactions you were done; there was no way to go on. So I imagine 7.2 was a big relief to whoever it was that learned that the hard way. I wasn't here for that part, but I'm sure there are some great stories about it. It also saw the introduction of statistics. It's really remarkable, looking back over these release notes, just how basic the database was in those days when you compare it to the power and flexibility it has now. This is the beginning of internationalization. It's still very, very early.

7.3 introduces (here's a nice little cherry-picked one) the ability to remove a column from a table. Before this, in Postgres in 2001, you couldn't do that. It also brings prepared queries and dependency tracking, so the ability to say DROP TABLE CASCADE or, maybe even more usefully, the ability to say DROP TABLE without leaving zombie objects behind.

7.4 is the end of the 7 series. It introduces a regular expression rewrite and, of course, tons of performance work. For all of these, when you see a lighter release, just assume that's because most of the work was improving and polishing. But I want to call this out: it was in 2003 that autovacuum began to really be a part of the project. I think this is when it landed in contrib.

So, to sum up the 7 series: this is a period of about 4 years, and the focus is really on removing limitations (not being able to vacuum away bloat, not being able to go past 4 billion transactions, not being able to do this, not being able to do that) and also really just filling in the basics. At this point Postgres is a relatively competent, if less performant, kind of MySQL alternative, I would say at a comparable
level of completeness.
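As a rough illustration of the "4 billion transactions" wall mentioned above, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope sketch, assuming the limit corresponds to a 32-bit transaction counter that is never recycled; the transaction rates are hypothetical and were not given in the talk.

```python
# Assumption: the "4 billion" wall corresponds to a 32-bit transaction counter.
XID_SPACE = 2 ** 32  # 4,294,967,296 possible transaction IDs

SECONDS_PER_DAY = 86_400


def days_until_exhaustion(transactions_per_second: float) -> float:
    """Days until a never-recycled 32-bit counter runs out at a steady rate."""
    if transactions_per_second <= 0:
        raise ValueError("rate must be positive")
    return XID_SPACE / transactions_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical sustained rates; remember that, per the talk, every
    # SELECT used to count as a transaction as well.
    for tps in (100, 1_000, 10_000):
        print(f"{tps:>6} tx/s: about {days_until_exhaustion(tps):.1f} days")
```

Even at modest rates the counter is exhausted in weeks or months, which is presumably why removing the limit was such a relief.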
So, on to the 8 series. It's 2005, and Postgres gets Windows support. It is really remarkable to me that I can even say that; you just learn to take these things for granted. Point-in-time recovery is introduced in 8.0. This is quite a big deal if you have some kind of a problem and want to go back in time. Tablespaces had been a major system administration requirement for a long time.

8.1 introduces roles. Prior to that there were users and groups and permissions, but they were sort of a Postgres special thing; this introduces most of the SQL role standard. Tons of performance work. It's not until now, three years later, that autovacuum actually makes it into core, and it's still not on by default. For those of you who are waiting for things like logical replication, it really helps, I think, to look at this as an illustrative example of how the Postgres community marches inexorably, but occasionally very slowly, towards awesome. 8.2
introduces concurrent index builds, and again that's kind of a nice thing: you don't lock the table. It introduces new array functionality. And I really want to call attention to this moment here in '07, because this actually sees hstore and pgcrypto land in contrib as well. That's kind of a big deal, because this is the first time in the release notes, going back, that I see people begin to really take advantage of the data types and extensibility of Postgres in a big way. By this point, sort of running in parallel, PostGIS is out there and it's around, but to me the introduction of these things in contrib is a sign that Postgres is beginning to take this concept of being the extensible database in a much more serious way.

8.3 then follows up with a trio of other similar features. Full-text search, formerly tsearch2, lands in core and becomes part of core Postgres. The XML data type is introduced, and we've been fixing the security holes in it ever since. And the uuid-ossp library provides the UUID functionality. These are all, by the way, grossly underappreciated features. But it's not until here, in 2008, that autovacuum is on by default: it was in '03 that autovacuum had its first Postgres release, it's not until '06 that it lands in core, and it's not until '08 that it becomes on by default. So congratulations to all of you who worked on that, and my condolences.

'09. I used Postgres on and off back in the early 2000s, but this is where I start to really pay more attention to the community, so I'm a little better on the details of some of these things. 8.4 brings in windowing functions. I saw in the release notes for 9.3 that they have better documentation now, which is good, because this is an incredible, incredible feature. If you haven't figured it out yet, that's not surprising: the documentation is really difficult to digest until you get used to using them. CTEs, or WITH statements, land in 8.4. Warm standby, that is, the ability to have a Postgres running, receiving WAL and waiting in case of an emergency, becomes possible. This release also introduces, I think, the first real push towards improving visibility into Postgres performance, which is still an area where there's a lot of room for improvement: pg_stat_statements and the auto_explain features land now.

Again, I would sum up this series by talking about advanced SQL and new data types, but I also think there's an increased focus on operability and transparency that epitomizes this set of releases. And this is a really big thing: this is PostgreSQL starting to reach maturity, I feel. At the end of the 8 series, what we really see, I think, is a database that is competitive with many other offerings in terms of capability, but also beginning to explore where it can go and chart out its own unique and distinct identity. And of course that leads to where
we are today, which is the 9 series. The 9 series
introduces hot standby, which, if you're not using it today, is a very important feature, since it lets you run read replicas and route traffic to them. This gives you all kinds of desirable capabilities above and beyond just failover. It also lets you isolate analytics. We have Heroku customers who run a relatively small database as their primary, just to take writes, and then have several read replicas that they distribute traffic across, so that the analytics guys can have a playground which won't affect the production side, and the production side never slows down. So they can scale up and down as needed on the read replicas. This is a really big deal. Prior to 9.0, one of the most common complaints about Postgres that you heard from MySQL users, aside from 'it's slow', was 'it doesn't really have replication'. No offense, again, to those who built things like Slony and Londiste, but those were not in core and were therefore difficult to approach.

9.0 also sees the introduction of pg_upgrade. It scares a lot of people, but pg_upgrade, I think, is a real game changer, because it means that if you have a large database you have a path forward that doesn't involve 24 or 48 hours of downtime to dump and restore a database or, worse, a multi-week, multi-month engineering project to set up some kind of replication scheme to get you over. This is a really big release, I think, in terms of making PostgreSQL truly competitive, and I think that's why the 9.0 label is really a great thing. I was told that hot standby was such a big deal that it was one of the reasons for the label. This is actually when
Sun acquires MySQL and Oracle acquires Sun. I think in the community, among some audiences, there's a feeling that a big part of Postgres's success comes from MySQL's decline. I agree that there is a component of that; a weak competitor creates opportunity. But I don't think that's actually a fair assessment of why Postgres is doing well. I think the real changes are already well under way. The big motion
forward, to being this amazing database that we have now, is already clearly staked out in the community's progress here. I think that the
MySQL acquisition, having Sun acquired by Oracle, meant that MySQL would never really be allowed to become a competitive database with the big contenders, but that doesn't mean that Postgres wasn't on the path to greatness before that. I think this just helped accelerate it.
So 9.1 introduces sync rep. The synchronous replication support in Postgres is now unparalleled by any database in the world. You can control it per transaction, and there are all kinds of different durability guarantees that you can choose. This is unlike anything else; elsewhere, to stand this up you need to have some crazy enterprise license and sync-rep your entire database back and forth, and that's a major affair. I set up sync rep on a cluster not too long ago: it was like 2 lines in a config file, and that's really great.

9.1 is also the beginning of foreign data wrappers, with foreign tables in read-only form. This follows a theme throughout Postgres. Take localization: there's per-column collation support here, a previous release introduced per-database collation support, and a release before that introduced per-cluster collation support. So you can see this march toward ever greater generality throughout all these projects, and foreign tables are on that path: 9.3 now introduces writable foreign tables, and that's sort of the culmination of this project. 9.1 is also the first release that introduces extensions, which I think are Postgres's secret weapon in the days to come. I'll talk a little more about that.

On to 9.2: index-only scans, and cascading replication, so replication is still improving. And range types, which was a really amazing new contribution that no other database can really do, because a range is a composite type that envelops another type. It simplifies the work you have to do, and it integrates into the advanced indexing systems that Postgres has in a way that is really unique and not conceivable for other databases to implement. The nature of the Postgres community and code base was such that, although many people contributed to this patch, it's this focus on composability of the components of Postgres that really made this possible. Prior to range types there was a type for time
intervals. Jeff, what was the term for the thing before? Right, temporal, or period. Range types are a generalization of this, which says you can have a range of anything. Together with exclusion constraints, which landed earlier in the 9 series and which I edited out of this history, you can actually create a really powerful and flexible scheduling system. There are other really cool use cases that you can fill as well, and you can do this in a way that takes advantage of all this great indexing.

We also see the beginning of JSON in 9.2, and 9.3 introduces JSON manipulation functions. If you go to Oleg and Teodor's talk (I'm not sure who is presenting, sorry), there's a bunch of new, exciting work coming down the pipeline that is leading toward ever greater power in that space as well.

9.3: of course there's a talk on the new 9.3 features. We've got materialized views, kind of. And again this is sort of the theme: materialized views have landed, and it's tremendously exciting, but they're not very useful yet because you still have to update them manually. The infrastructure is in place, however, and I'm sure that in the future this will be transparent and beautiful and wonderful; in the meantime this is a major step toward that. Again, JSON functions: you can now reference into a JSON object, and you can create functional indexes on it without needing to rely on something like PL/V8 or a new query language. This is a big step forward from the last release, where we had JSON but really you could just declare a column as JSON and it validated that you had JSON in the column. So it's a big step in the right direction. And foreign tables have gained writability in 9.3. This is huge. Someone described Postgres's future as being, potentially, a data broker: Postgres sits at the center, routing all the data to the right places and distributing queries out to all the edges of
your data graph.

One of the most striking changes I've seen occurring in the time that I've been working in the database space is that people don't so much choose one database anymore; they choose a variety of databases for their needs. For example, Andrew Dunstan was talking about pairing Redis with PostgreSQL. There are people combining Amazon's DynamoDB with PostgreSQL. We see people pull in all kinds of things these days. SFPUG, just last week, had an engineer from Stripe, a payments company that runs everything on MongoDB for some reason, come in and talk about how they built a pipeline from MongoDB to Postgres, which gives their analytics people the power and flexibility of Postgres over that data.

9.3 also gets regular expression indexing, which is just so cool that I had to call it out. That's a really exciting thing.

So, at a high level, what was the 9 series all about? It's about read scaling, being able to do more than scale up your database. And it's about extensibility: we really see the database community start to embrace this creativity and ask not 'what does the standard say?' or 'how are we going to catch up?' but 'what can we do, where do we want to take this thing, what's next?' This is, for me, enormously exciting. I can't wait to see where this goes.
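The scheduling use case mentioned for range types and exclusion constraints can be illustrated with a small Python sketch. This is not how Postgres implements it; it only mimics the rule such a constraint would enforce (no two bookings for the same room may overlap). The names TimeRange and RoomSchedule are hypothetical, and the linear scan stands in for what the database would answer from an index.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class TimeRange:
    """Half-open range [start, end) over timestamps."""
    start: datetime
    end: datetime

    def __post_init__(self):
        if self.start >= self.end:
            raise ValueError("start must be before end")

    def overlaps(self, other: "TimeRange") -> bool:
        # Two half-open ranges overlap when each starts before the other ends.
        return self.start < other.end and other.start < self.end


@dataclass
class RoomSchedule:
    """Rejects any booking that overlaps an existing one for the same room."""
    bookings: dict = field(default_factory=dict)  # room name -> list of TimeRange

    def book(self, room: str, when: TimeRange) -> None:
        for existing in self.bookings.get(room, []):
            if existing.overlaps(when):
                raise ValueError(f"{room} is already booked during {existing}")
        self.bookings.setdefault(room, []).append(when)


if __name__ == "__main__":
    schedule = RoomSchedule()
    schedule.book("A", TimeRange(datetime(2013, 5, 23, 9), datetime(2013, 5, 23, 10)))
    # Back-to-back is fine because the ranges are half-open.
    schedule.book("A", TimeRange(datetime(2013, 5, 23, 10), datetime(2013, 5, 23, 11)))
    try:
        schedule.book("A", TimeRange(datetime(2013, 5, 23, 9, 30), datetime(2013, 5, 23, 10, 30)))
    except ValueError as err:
        print("rejected:", err)
```

The point of doing this in the database rather than in application code is that the rule is then enforced for every writer at once.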
Wait, you're probably saying,
but can you oversimplify this even more, having digested 15 years of database history into 15 minutes? Yes, I can. Postgres 7, I think, was really all about foundations and durability; it was about just getting a database that works. The Postgres 8 series is all about adding functionality and performance. This is where that slogan, 'the world's most advanced open source database', really became an accurate description, I think. PostgreSQL 9 has been about replication and extensibility.

What will the next series be about? You can go to some talks, but I'm going to give my assessment of what I would expect. I think we're going to see a huge explosion in community extensibility. PGXN is an amazing project, because as it grows and matures it's going to make it possible for this new extension work to be available to everybody. I think we're also going to see massive scalability: the logical replication work, which just missed 9.3, which is too bad. In 9.4, when it lands, I'm sure people are going to go, 'this doesn't quite meet my needs.' But just wait, it will. This is the way that PostgreSQL works: year after year it takes short steps forward, one at a time, inevitably, toward something different. And visibility is increasingly a thing that we're seeing, with pg_stat_statements. I think it's an area where Postgres is really weak, and it has been weak because visibility has not been much of a priority.

I think that visibility becomes increasingly important as more and more Postgres installations get out there which are operated by application developers and end users, as opposed to having the support of a really skilled DBA. And even in the hands of a skilled DBA, visibility is so much more powerful when you have good tools. A great example of this would be the 9.2 extensions to pg_stat_statements, which allow you to see, by query, which queries are consuming the most time in your database, which are consuming the most I/O, and how many times each query has been run. That's a window into your database that was basically impossible to get before. This is a really remarkable improvement, and we will continue to see that expanding.
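To make the "by query" view described above concrete, here is a minimal Python sketch of the idea: group executions by a normalized query text, then total up calls and elapsed time. It is an illustration only, not the pg_stat_statements implementation; the normalize and summarize helpers, the crude regex-based normalization, and the sample log are all assumptions made for this example.

```python
import re
from collections import defaultdict


def normalize(sql: str) -> str:
    """Collapse literals so queries that differ only in constants group together."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # integer literals
    return re.sub(r"\s+", " ", sql).strip().lower()


def summarize(log):
    """log: iterable of (sql, elapsed_ms). Returns rows sorted by total time, descending."""
    stats = defaultdict(lambda: {"calls": 0, "total_ms": 0.0})
    for sql, elapsed_ms in log:
        entry = stats[normalize(sql)]
        entry["calls"] += 1
        entry["total_ms"] += elapsed_ms
    return sorted(stats.items(), key=lambda item: item[1]["total_ms"], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample of executed statements with their durations.
    sample_log = [
        ("SELECT * FROM users WHERE id = 1", 2.0),
        ("SELECT * FROM users WHERE id = 42", 3.0),
        ("UPDATE users SET name = 'bob' WHERE id = 7", 10.0),
    ]
    for query, row in summarize(sample_log):
        print(f"{row['total_ms']:8.1f} ms  {row['calls']:4d} calls  {query}")
```

Sorting by total time rather than by single slowest execution is the design choice that surfaces cheap queries that are run very often.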
Let's talk a little bit about how we got there, which is this shift from taking a young open source project and building a good foundation to now opening the doors to new kinds of work and new ways to go forward. I want to talk about marketing, because if there's one thing that the competition has been better at than Postgres, it's this. I'll start by saying a few words about adoption and why you should care even though it works fine for you. So,
can I take it as granted that you are using Postgres? Right, OK. So maybe you're also
building Postgres; that's definitely a thing. But the people
who build PostgreSQL use Postgres, and companies that run PostgreSQL invest in it. If you have no users, then there are no developers; if you don't have any developers, then there aren't any patches; and if you don't have any patches, there's no
Postgres. So there's a chain from getting people into Postgres, from expanding the community, from focusing on adoption, that leads to a better Postgres. But it's not always easy to build. Hence marketing for open source, which is really: how can we
raise awareness without spending money? Because we are an open source community. Fortunately, people do this for
us, and they do this in a way that I'm going to describe using a marketing term called the funnel. How many people know what a funnel is, or what it means? OK, a few.

The funnel is a marketing term which describes the process whereby somebody goes from some initial state to, in traditional terms, a sale. In the context of an open source community, what we're really talking about is: what are the steps that it takes for a person to go from never hearing about Postgres to coming to PGCon, becoming a contributor, being a developer in the project? How can we understand those steps, and how can we focus on improving them? There are certain places in this process where I think the Postgres community does amazingly well, and there are certain areas that I think could use improvement.

The reason it's called a funnel is that you lose people at each step. If you've never heard about Postgres, you're not going to install it. Once you install PostgreSQL, a certain number of people are going to get confused and give up before they use it. If you start using it, you may get scared or confused or frustrated and then stop using it. And if you get into production, you can run into some problem you can't fix, and then you move off. So you end up with this shape that narrows on the way down. The goal of what's called funnel optimization is to say: OK, can we expand the top, can we get more people exposed to this? But it's also no good if the people you're exposing to what you're doing don't then take any action on it. If everybody's heard about Postgres because it's that database that took down a building with somebody in it, then that's not good, unless maybe you can somehow spin that. So let's talk about this. This is
what I would describe the funnel as looking like for an open source project, an open source community. You begin with word of mouth: have people heard of it? Then you have to get from having heard about it to installing it somewhere to try it. Once they get there, how are the first 15 minutes? How many times have you installed a piece of software and given up within 15 minutes? I guarantee you that happens with most of the software you try. If they make it through that first 15 minutes, what do they do, what do they learn, what are the challenges that they come across? And then, once they get from that into joining the community, will they stay? Are they going to become advocates, are they going to get out there and talk in the community, or are they going to just kind of fall out because they have some problem they can't solve, or something else shiny comes along?
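The funnel described above is easy to put into numbers. The following Python sketch computes, for each stage, the conversion from the previous stage and from the top of the funnel. The stage names follow the talk; the counts are entirely made up for illustration, since, as the talk itself notes, real data for most of these steps is hard to find.

```python
def funnel_report(stages):
    """stages: ordered list of (name, count).

    Returns a list of (name, count, step_rate, overall_rate), where step_rate is
    the fraction retained from the previous stage and overall_rate is the
    fraction retained from the first stage.
    """
    if not stages:
        return []
    top = stages[0][1]
    previous = top
    report = []
    for name, count in stages:
        step_rate = count / previous if previous else 0.0
        overall_rate = count / top if top else 0.0
        report.append((name, count, step_rate, overall_rate))
        previous = count
    return report


def weakest_step(report):
    """Name of the stage (after the first) with the lowest step conversion."""
    return min(report[1:], key=lambda row: row[2])[0]


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    stages = [
        ("heard of it (word of mouth)", 10_000),
        ("installed it", 2_000),
        ("got through the first 15 minutes", 800),
        ("went to production", 200),
        ("stayed in the community", 100),
    ]
    report = funnel_report(stages)
    for name, count, step, overall in report:
        print(f"{name:<34} {count:>6}  step {step:6.1%}  overall {overall:6.1%}")
    print("weakest step:", weakest_step(report))
```

Looking at the step rate rather than the raw counts is what tells you where to spend effort: widening the top does little if a later step loses most of the people.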
Word of mouth: let's talk about this. Postgres has a great reputation. What does it have a great reputation for? It's powerful, it's feature-full. But if you type 'Postgres is very' or 'Postgres is really' or things like that into Google, a lot of people talk about how Postgres is slow. Postgres is not slow. Why do people think that? Because other databases have basically differentiated themselves by being faster, and faster than what? Faster than Postgres. Public perception lags a long way behind reality, and you have to be out there and talk about these things. But I actually don't think the slowness is really a big liability in this day and age. I think scalability is a much bigger liability when I talk to people out there.

Now let's talk about Mongo instead. MongoDB is known for being performant, and it's known to be easy to get started with. And, I want to include one sort of bad word here, in parentheses, it's also known for being unreliable. When I go out there and talk to people, a lot of the time I say: hey, what database do you use, why did you choose it, what led you to it? Increasingly I hear a ton of people say that they use Mongo, especially in the Node.
js community, which is one of the fastest growing communities of software developers anywhere. I think that you ignore that community at your own peril, because the unification of JavaScript from the client to the server side to the database is really a very powerful concept, and it seems to be getting a lot of traction right now. It's definitely growing faster than languages like Python or Ruby, though where it ends up is impossible to predict. In the Node world Mongo is king, which means that if Node becomes the next big language then no one will ever try Postgres, because they're just locked into this other community. That's why this work around the JSON data type and PL/V8 is so important: it helps to raise awareness and get them to try Postgres, and in my experience, when people try Postgres, the perception tends to correct itself.

So what does the installation look like? That's the next phase. How do people get PostgreSQL? Mostly people get things through package managers like apt-get. What does that experience look like, and how do we optimize it? You apt-get install Postgres on Ubuntu: great experience. If you're on a Mac and you install Postgres with Homebrew, it's terrible. Next thing you know you're modifying kernel parameters and rebooting your system, and if you do that wrong your system fails to start. What if you're using a service provider like Heroku or Engine Yard or AWS? Amazon doesn't offer Postgres today, which means that if you start by choosing your service provider and you plan to use them to run your database, and you chose AWS first, that's the end: Postgres isn't even under consideration. So it's worth looking at these different areas where people are adopting databases and understanding how that fits into the community's success. At any rate, people have to get Postgres from somewhere before they can do anything else. So let's look at
the PostgreSQL website. In the spirit of being a marketing person for the day: is there a call to action on the Postgres site? Does it use good words to describe what the database is? What are the words that we want people to think about when they think about PostgreSQL? Advanced, powerful, scalable, flexible. I wonder how many of those words you will see when you go to the home page. And then, who's looking at the conversion rate and running experiments to try changing that page? I talked to people and actually got some answers, but here's a screenshot from
the home page from a few days ago. The first thing you see is this blue thing that says PostgreSQL. I think it's really helpful to put it up on the projector like this, because it blurs the details somewhat, and you can tell which particular things your eyes are drawn to. I see a big elephant; that's good. You don't see the word 'database', right? It's actually here, in very small letters, as are other good words: 'advanced' and 'open source'. Pardon, 9 point, I'm sorry, I apologize; maybe I shrank the screen size. And you see the beta. That's great, but is that the first thing we want people to see when they come to the Postgres page? It depends who you're targeting. If, as a community, we're focused on getting existing people to run the beta, then this is a great choice. I don't know what the conversion rate on this is, and I don't know who would run this test. But the question is: when a new user comes, are they grabbing the beta because that's the first link that says 'download'? When I go to a web page, a lot of the time it's kind of a landing page, and I see the big button and I click it, because I know that I want the thing and the thing is there. I read like 3 words and I click the button. That's a great experience. There's a lot of information here. There's some good stuff, like a featured user; that's really cool. There are some quick links to the latest releases, but which one of these do I want, and why? It's hard to say. Could we just have a button that takes me to the thing? Anyway, optimizing web pages is a whole art and discipline unto itself, and that's an area where I think the open source community would do well to build expertise, because it really helps to drive more people toward using your software. And as we know,
more people using the software means more people finding bugs, means more people suggesting features, means more people getting into the developer community, means more patches, means more Postgres. So it's really important not to neglect this part of the process. And why are
people choosing Postgres? That's another good question to understand. Are developers choosing it because they hear about things like hstore or PostGIS that they can't get anywhere else? Are they doing it because it's a framework or provider default? Heroku has really shifted the Ruby community towards Postgres, because we had the fortunate accident of choosing PostgreSQL when we didn't think it mattered which one we chose; this one just seemed easier to operate. The Django community recommends Postgres first among databases. Jacob Kaplan-Moss, a founder of the Django project, has said that if he had his way he wouldn't even support any other database. That's a big deal, and building bridges to those communities is a great way to get more adoption. Listen to them, ask them questions: what do they want, where does it hurt? I know one channel that does drive people toward Postgres: when they've had a bad experience with another database, they go and look for where they should go instead. I had a conversation with a consultancy called Pivotal, which is well known in the Ruby world, and this is generally the first thing they do when they meet a client with a MongoDB database: get them onto Postgres. That's the first step.

There is another good thing to look at: who's choosing Postgres, language by language. The problem is that we don't have data about a lot of this stuff, or at least I couldn't find it. In Python and Ruby I am actually reasonably confident, from the data I've seen, that Postgres is the preferred choice. How are we doing with PHP and Java developers? What about Node.js? What are the languages of the future, where are the areas that are growing, and can we be the default there? Because, you know, if you have a seat on a rocket ship...
Talking about the first 15 minutes: the MySQL guys were really good at this, and MongoDB has completely copied the playbook. It's all about the time to the first 'aha', where you really get why this is a big deal and why you should be excited. There are ways to help with that, low-hanging fruit, and Postgres is making big strides here. I'd especially like to thank Robert for committing the System V shared memory change in 9.3. That means that from now on, when new users try Postgres for the first time, the first thing they do is no longer to go change some bizarre system kernel control config and then reboot the computer. That is awesome.

That kernel parameter configuration also led to all kinds of problems down the road which were mysterious and difficult to debug, despite the many great efforts of people to streamline it. Now it's no longer a problem.

Once they get on board, how do people learn how to use Postgres? How many of you have used the Postgres tutorial to learn Postgres? Four or five of you, that's it. And how many of you did that in the last decade? Zero. So how, then, do people learn about the features of Postgres? I like to talk about the stuff that's got some flash to it, like hstore and JSON, but there's also the stuff that is now taken for granted. How do people learn about good ways to use foreign keys? How do people learn about things that make life really, really painful if you don't know they exist, like savepoints? How would you ever find out? If you're using psql and you don't have \x auto in your .psqlrc file, which you probably also didn't know existed, this is a great feature which automatically reformats the output of queries to be wide or tall depending on what comes back from the database. But how would you ever know that was even an option? How do you take these invisible things and make them discoverable? I'm told by many users that the output from \h, which lists all the SQL commands, terrifies them; they don't know what 90 percent of those things are. Why is \d+ not the default? These are important. These are things that you forget if you use Postgres every day; you sit down with new users, you talk to them, and they just have no idea what's going on.

When people have problems, how do they solve them? As the community continues to grow, we need a good answer to this question: how do you
solve problems? How do they go to production? What does the productionization checklist look like? Is there a checklist where a user can say, "I've got Postgres, I want to launch, what do I need to do?" Could we put that on postgresql.org? Can someone put up a printable thing, a checklist someone can go down that says, "Make sure that this is set, make sure that that is set"? There are lots of great talks at conferences about this, but I don't know of a good canonical place to send people who ask me that. And of course, one thing people ask a lot is: does it scale? [unclear] So once they get through all
this, once they have managed to hear about Postgres, if they heard good things, once they have managed to install it and get their hands on it, once they have managed to figure out how to use it, once they have managed to put it in production: are they going to stay? [unclear] So let's talk pain points
and existential threats. These are two categories of areas where you can improve Postgres and help it go forward, but which have different time horizons and different kinds of characteristics. So, pain
points. Some examples: bad query plans can really ruin your day. Users getting frustrated by things like kernel parameters can really turn people off. Upsert is just kind of a no-brainer that we should do at some point. Materialized views: people come from other databases and say, "Wow, you really don't have that?" So these things
affect adoption, and they affect attrition. What fixing them is not likely to do is really change the game in terms of the scale of the addressable market for Postgres as a software project.
On the other hand, what is an existential threat? I'm pretty
sure Postgres itself is not going away. When I say existential threat, what I really
mean is: is there a future where smart developers with interesting problems should probably choose something else? For smart developers with interesting problems, is the right choice not to use Postgres? Right now I think that for many, many use cases, possibly the majority of use cases, the right tool for the job is Postgres. That is a really, really cool place to be. But the world is changing. If you focus only on solving pain points and optimizing that experience, eventually you get passed. So what are the kinds of things that could lead to that? Take COBOL as an example: if you stand still long enough, the world moves on. People write COBOL today, but nobody thinks smart developers with interesting problems are better off using COBOL for their projects. So the kinds of things I am talking about when I say that
are: what if the way that people build software changes? The relational model is beautiful. It is obscured by SQL to the point of almost unrecognizability, but underlying that there is this beautiful set-theoretic manipulation which has wonderful properties. But what if the world switches to thinking in documents, or the world switches to thinking in graphs? What if people stop trusting databases because they are found, fundamentally, not to scale well? It is very expensive to maintain consistency on large datasets, and by large I do not mean a gigabyte, I mean a petabyte. I do not know of consistent databases at that scale that do significant transactional velocity and maintain total referential consistency. So what happens is people say, "Well, I can't use a database, I have to implement this stuff in the application," and they learn to deal with the problems elsewhere because they have needs that cannot be met. This
data overload is a real existential threat to this project. Datasets are getting bigger. And again, I want to emphasize that I am not saying the use cases that are out there today are going to go away. People will use PostgreSQL for 50 years, I have no doubt in my mind. My point is: will they be using it to maintain their legacy systems in an emulator, or will they be building new, amazing things on Postgres? That depends on whether the kinds of challenges arising in this space are risen to and met by the project. I talk about the Postgres petabyte project: how do we get the first unmodified cluster with a petabyte of data usefully in one place? There is a long road to get there; it will take years. And when we talk about massive throughput, what would it look like to have a cluster that would serve Facebook? As an aside, Facebook is a really interesting case, because it is on MySQL, and it is not on MySQL because that was the right choice. It is on MySQL because Mark Zuckerberg, in those first 15 minutes, found a database that worked for him, and they never reached something that they could not solve along the way. So here are some ways you can
help Postgres grow. [unclear] Study the new users: they have extremely valuable ignorance. If you are in this room, you are probably blind to most of the problems. Go sit down with someone who is using Postgres for the first time and take notes. Where does it hurt? Get out there in the community and give exciting talks. There is an enormous amount of really cool stuff in Postgres that people don't know about, whether it is traditional (traditional is the wrong word), standards-compliant, relational-algebra amazing, like window functions, the kind of thing that is the bread and butter of the relational database's strong area, where people don't know you can do that; or whether it is the crazy out-there stuff, like, "Oh my god, you're using a foreign data wrapper around the Twitter API?" These things excite people. When you get out there and talk about it, it makes people reconsider what it means to use a database; it makes people rethink what they are doing. So get out there and talk about foreign data wrappers, and not here. People here are already listening. Go out to a Node.js conference and give a talk on things that they will be interested in.
Go out to a Python conference and don't just talk about [unclear]; talk about amazing ways that Django can better leverage the database. Join the Django mailing list and submit a patch for advanced Postgres functionality that is not well supported today. Write introductory user content. People ask me all the time, "Where do I go to learn about using Postgres?" They have got a service provider like EnterpriseDB or Heroku, or they have got a sysadmin who can run PostgreSQL for them. What they want to know is: how would you find out about pg_stat_statements? How would you find out about pg_stat_activity? How would you learn when to use a GIN or a GiST index? This stuff is not super well documented. It is there in reference form, because the Postgres manual is amazing if you know a thing exists. But how do you learn what is possible? Get out there and write some data types. There is so much low-hanging fruit here. Why is there no URL type? I want to make an index on host. I have got referrers coming in, and I want to find all the ones that were referred from google.com. I can do this with crazy regular expression magic, but a URL type is not yet implemented. This is non-trivial; URLs can be tricky. But these kinds of problems show that it is not as if all the easy stuff is gone. It is not as if you have to implement crazy, obscure physics things to even find something to work on. Tweets as a data type: there is just easy stuff to do. Why is there no SI units data type, which makes sure that I am not accidentally adding meters and pounds? Why is there no data type that works like a range type, wraps another type, and makes sure that I am not mixing up miles per hour and kilometers per hour, multiplying those together, getting something bad, and then crashing my rover into the surface of Mars? If we give people this kind of functionality and then tell them about it, they [unclear].
And this one is another kind of hilarious embarrassment: I think there is no known sample database out there. There is no thing that everybody refers to that shows all the basic features and functionality of PostgreSQL and goes above and beyond the MySQL stuff. Really embarrassingly, the only thing that even comes close is a port of the MySQL one. I see so many talks where people refer to the regression test databases or the pgbench databases. This is not the easiest thing, or someone would have done it by now, but you do not have to be a database hacker to help out here. Make it, put it on GitHub, tweet about it, get a lot of people on board. It's fun. And so my request is: if you are Tom, if you are Dimitri, or if you are any one of a million other people in this room who are probably judging me now, I ask that you continue to design for simplicity. I will say that if using the feature that you are proposing requires understanding database internals, you have lost. There are so many users out there who will never be able
to approach your feature, who will never really take advantage of what you build, if it requires understanding things about buffer caches and everything else. As a counterexample, I think TOAST is a great demonstration of a case where, if you architect something well, it just becomes part of people's workflow and imposes no cognitive load on them. They can just take advantage of it for the rest of their lives. That's a big deal. [unclear] Postgres has been successful not because of MySQL failing, though that certainly didn't hurt. Postgres's
success comes from solving real user problems. And I think that the greater success Postgres has seen recently is because it is starting to solve problems that do not just apply to a few users but apply to a lot of users. If you are thinking about problems to solve, look for things that will help a lot of people a bit, because that can be just as important as helping a few people a lot. There are some great
things we have behind us already. Postgres is stable, and Postgres has replication. The data types and extensions are really phenomenal, and I think we are seeing a huge explosion there thanks to the work of a number of people. pg_upgrade is a game changer.
And there are many great things ahead of us. We are going to see extensibility continue to grow. I am hearing stories about pluggable storage engines, pluggable parsers, pluggable executors. Once this stuff gets opened up, it can be a pain to maintain those APIs, but then someone in the community can say, "I need to calculate statistics differently for this use case that I have, and I don't want to wait. My stove is on fire; I don't want to wait until the new model of the stove comes out." When people can solve those problems for themselves, then people can start to share code. People can adopt things ahead of the curve, and those great things are out there getting field-tested in the real world. When they are really valuable, we can fold them into the project, and if they are just applicable to a niche use case, they don't need to be. [unclear] There is a great extension; you can use it when you have that problem. Replication, replication, replication: I think a lot of the hardest engineering that is going to go into the next release, or the next series of releases, is going to be around building out bidirectional replication, multi-master replication (there are all these different terms for this), going beyond what is possible with one node, and addressing the limitations of having one single spinning magnet that holds all of your business's most precious data, where the limit on how fast you can write is just how fast that magnet is whipping around inside a box. New data types: I think we are going to see a lot of really exciting things start to come up here. The infrastructure is in place, and I think we will see a lot of really cool extensions. The future is going to
be awesome, or it will be if you make it so. Thanks very much.
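
One idea raised in the talk, a data type that wraps another type and refuses to add meters to pounds, can be illustrated with a small sketch. This is not PostgreSQL code and not something the speaker showed. The `Quantity` class, its fields, and the unit strings are hypothetical names chosen for illustration, and Python is used only to show the shape of the check such a type would perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number tagged with a unit (hypothetical; not a PostgreSQL API)."""

    value: float
    unit: str

    def __add__(self, other: "Quantity") -> "Quantity":
        if not isinstance(other, Quantity):
            return NotImplemented
        # Refuse to combine values whose units differ instead of
        # silently producing a meaningless number.
        if self.unit != other.unit:
            raise TypeError(f"cannot add {self.unit} to {other.unit}")
        return Quantity(self.value + other.value, self.unit)


def demo() -> None:
    # Same unit: the addition goes through.
    print(Quantity(3.0, "m") + Quantity(4.5, "m"))
    # Different units: the mistake is caught at the point of addition.
    try:
        Quantity(3.0, "m") + Quantity(2.0, "lb")
    except TypeError as exc:
        print("rejected:", exc)


if __name__ == "__main__":
    demo()
```

A real database type would presumably also need unit conversion, multiplication, and indexing support; the sketch only covers the "do not add mismatched units" check mentioned in the talk.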

Metadata

Formal Metadata

Title How Postgres Got Its Groove Back
Subtitle Why a 25-year old database is the next big thing
Alternative Title How Postgres got its Groove Back
Series Title PGCon 2013
Number of Parts 25
Author Hardenberg, Peter van
Contributors Heroku (Sponsor)
License CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You may use and adapt the work or its content for any legal and non-commercial purpose, and copy, distribute, and make it publicly available in unchanged or adapted form, provided that you credit the author/rights holder in the manner they specify and that you pass on the work or its content, including in adapted form, only under the terms of this license.
DOI 10.5446/19048
Publisher PGCon - PostgreSQL Conference for Users and Developers, Andrea Ross
Year of Publication 2013
Language English
Production Place Ottawa, Canada

Technical Metadata

Duration 51:58

Content Metadata

Subject Area Computer Science
Abstract Why a 25-year old database is the next big thing.
