Bestand wählen

Temporal Data Management in PostgreSQL: Past, Present, and Future

Zitierlink des Filmsegments
Embed Code

Automatisierte Medienanalyse

Erkannte Entitäten
they blow everyone my name is Jeff Davis I work for Aster Data part of their data and I will be talking about temporal data management and post tests where we were a long time ago where we're going to today and I where we're going in the future so 1st all
start off with a kind of a really simple words would seem like a simple problem long time ago Moses trying to devise some kind of you know sort of audit trail of and auditable log of what changes have been made to the database and over time so you can imagine and in this case it was a system involving a fixed assets so if an asset was reassigned you know from inventory to a person or from 1 person to another you want to keep track of that not only the action itself or where the asset happen to be at that time but also but who may have been reassigned reassignment in the database so you can figure out what happened but it turned out to be a little bit more complicated than I thought because of the limitations of the system at the time but the you know primary query for for any database from there's there's 2 sides to 1 is you know the representation of the information in the other is actually being able to query it in a useful way and suppose presses able to represent those changes pretty readily with triggers about doing
that the query itself there weren't a lot of mechanisms for them and this is the whole problem that is kind of white got me started down this path is that I felt like this problem as seemed like a typical problems and seem like something that should be able to be handled efficiently but it really wasn't at every step I felt like was awkward In particular queries seemed awkward and they
can perform well I so you know 1 the things about him I mean if if you've done this kind of query or you know trying to you know match up you know what happened to 2 records in the past that and trying to you know specify a date ranges over which that was valid that kind of thing but how many people have written queries like that out of the field you have pretty much everyone so you know a couple things 1 of 1 of them is that you need to be really careful with their less than greater than 1 and you quickly come around to a convention usually to kind of keep track of the better but you have to be pretty careful on you know there's a tendency to use animals for 1 side of the equation but then those don't work with the greater than less than comparisons you really so that as you would want them to do so just as final on 1 side the great you know the queries that working before for you know and so yeah you gotta be careful of moles and then there's also other awkward things like the stronger represent single points of time or or empty ranges of time
and then I'll quote performance on glossing over this this a little bit because this this talk is more about you know the entire problem space but then if you you know look at how query like this might have worked on you know quite a while ago Saint 8 3 a 8 4 and then you know you look at this and basically I just went right right this example actually tried to get to choose you know different land was able to do that mapping and because a lot times that can be reasonable I plan to execute this query but you know for that even it's a little bit tricky to get it to choose that plan oftentimes will just use 1 index even if there is an index available on both the kind of from the time and it's easy and the 2 times and so but you have a query like this this this kind of query look familiar to most people room yeah it's typical temporal query to the right find a you know when you have you know how might have looked at some point in time when somebody made a decision and and so this plan is just not not really workable but uh and you know even and even if the optimizer is able to choose a slightly better plan that land isn't really great when either of em from here
basically just gets worse so I mean this wasn't the end of the use case I think that if this was kind of the end of the problems that they you really want to solve without temporal data management then I think most people can get by you know there's usually ways you can work around it and manage the performance problems another race but will see the problems that little bit more complex there's more you want to do and eventually becomes unmanageable
so I tried to improve the situation through through a few of the projects I'm quite a while ago I started and I just wrote an extension of and it was to make a bit called period was a period of time basically only operated with 1 sub type and that was a timestamp with time zone which is the typical out of the temporal types offered by post press the 1 you ordinarily choose as a point in time and ask so I wrote the datatype around that held both the beginning point and an end point in time and that allowed a couple of of advantages of course that made the query simpler and it made them indexable because of those stresses extensibility system having those 2 things together in 1 data type and being able specify than necessary index support around it allowed it to perform and perform better by I don't have that I meant to the end of the
list of a few of these other things that is kind of the set a list of key you know where we started some features that have been added some features that are in the process of being and released and where I intend to go in the future but for
the period data this is kind of where I started in this provided kind of 1st that solve the immediate problems and like I said making inquiries up faster and allowing the index infrastructure to be used and it obviated the confusion around greater than less than signs and by holding the inclusivity or exclusivity of the bounds internal to the type and this this is actually superseded by range types which tell are you going to look a little bit more detail and since this was superseded I don't wanna describe the entire use for this this data type of course and that is work started this is making use of those ability to get us there are and then something
that couldn't be done with both stresses extensibility system at the time was also a seemingly pretty simple of problems and so no matter how you know how much I tried to make an extension work to solve this problem I couldn't so I I decided that we needed a core feature to provide an answer for avoiding schedule conflicts and out of all the features that I sort the time that really needed to make them world work this is the 1 thing I thought that there there's no way of avoiding at this must be in the court and so that prompted me to start writing some changes to corpus press and on there you can think of this kind of like a unique constraint but it's more flexible in that it can allow actually prevent schedule conflicts so also example here and
I'm in a 2nd but to describe it essentially unique means that if you have 1 couple and in that means that there is no other couple with an equal value for that attribute anywhere in the table and that's declarative so that means the system enforces it and you don't need there's no other logic you have to follow just the existence of the Temple of the value of 5 means that no other couple has the value 5 for that unique at tribute exclusion constraints allows you to kind of change that operate away from just equality but to say to an operation is very important for can quarrel which is overlaps and so on linear move on to the
example here I think it'll be more clear on so you can create a table with the definition like this and then I'll get to this this in a 2nd here this is hard to develop the kind of simple example and also not use the range types from from our previous work from our later development Joint get on to later but this year this TEST z range that's what supercedes the the period data types and that's what gives you that beginning in and and the indexability and and this 1 and curricula that is you know the same based on timestamp with time zones and so this this create table and has these 2 extra constraints down here that operate on this value in a special way on this attribute of special way and I said that the key is that were treating in a room and speaker just with equality so if the if the room matches and then the time period you got the room over all laps with somebody else then there's a conflict and that's what this constraint is describing so with the quality you have to say only if they're equal is there conflict in this example you can say well if the room is equal and during overlaps then you have a conflict again this is declarative so the system enforces it internally inefficient way and you don't have to do anything more you just declaring this to be true and then it works of course neither you can't have a the same of room book by different speakers 4 overlapping periods of time but you also can have the same speaker booked at 2 overlapping times in different rooms so really you have to separate constraints here that's why you have this exclude twice 1 is saying that the room get be double booked for overlapping periods of time the other is that the speaker can't be doubled looked for overlapping periods of time that's a pretty coming up pattern pretty common constraints and so is another common patterns of the the equals 4 1 attribute overlaps for another 1 1 you know the the room in this case indicates you know 1 resource which is the room and this is the time attribute here and that indicates that and this complex valued it has the beginning and end time so I went through level that quickly and some are there questions on this this feature here or why this is important events so this is not new and 9 . 9 9 . 0 1 but but as you can see I use the example with the 1 9 . 2 because so much easier to illustrate the UCI used extensions which make this easier to actually just go create this are directly from my slides because you don't have to have enough to show you how to install tree just you install using a package system for conferring on the operating system and then you can just do this and section and that simplifies things yes a nice and while this idea 9 0 this could be used there 2 things from 9 to used steer 1 of them is like a extension and that simplifies the because it makes the example more and more self-contained right and the other thing from 9 to I used is this range type here which I'll get to those in a 2nd but you can think of this like the period datatype that I just described so in 9 . 0 you have to use the period hype instead of only reside input in this slide is because I don't want people at a copy and pasting using the period that we're thinking that they needed to this is that the new rated do and it's more flexible and and so you can see we can see that but not so as as you can Dolezal that on the water in which to put the put these features 1 of them was you know ordered by the time they were developed and but yes so that's that's that's slightly confusing acknowledges that the root but this is basic so this is the attribute with with the constraint like unique you list of pollen and so you would have unique I and user ID right up or let's say unique guns in a user I. D. and you know come out from the drive some of some other you know user ID and remind you you match mapper
something like that and ever let's but with this in this case you have something a little bit more because room your matching with the quality and this 1 your matching with overlaps so he by adding the flexibility to use other operators aside for equality then we need a way to specify is so with is a key word and it's just a way to associate the operator would be at tribute and that describes the constraint 1 9 out of there and I know you can do this kind of thing and so if he actually specified everything as equals then it would be unique constraint but if you specified everything is equals that would have the the same behavior as need constraint with Knowles which I have I remember correctly it's just that basically anything that evaluates signal does not cause a conflict so for instance with the constraints and then you can have multiple values for the same attributes that are associated with the yes yes exactly and then you think yes because they were people most of us and this this is actually because you're using it just index can all but it's actually because of both of these because you need you need some index of need some indexable operator in order to enforce the constraints that stem internal implementation detail a little bit more time but you need but the index and that can support the search conditions so just can support this overlaps but with this extension just can also support the quality of the tree can always supported quality and but in this case you can mix and match so I needed 1 index types could support both the quality and overlaps and then by by installing there's then just was able to accomplish that we know that the cause of all were yeah and it's possible if this is you know common use case and this is not necessarily my decision and this at this point it's not really matter technology as much as where we decided that not if it it's a point where enough people are confused by having to install the extension and we would probably make that change but but then again we also want people to get comfortable using extensions because they're they're gonna be pretty important and so you know and but that's that's not really a technical matter and but it's become very simplified by making just 1 1 now and that's a Dmitri work that he did the exceptions were at the but if you have everything is equality so that this semantics are the same but they in the performance on you know some ICA benchmark that I did that was you know basically just trying to measure that that exact 1 piece of the system I saw about at 20 cent difference something in that in that range from and that's due to minor difference in the in the in the in the implementation of but it's actually very close in the implementation as well as the semantics of but there is a minor difference due to and they need to do this kind of 1 extra index probe after you've inserted it essentially doesn't get 1 optimization of the B-tree index that gets and so there is a difference there but it's pretty minimal and when you have a larger case it would ordinarily be a small part of the overall time it would be my expectations water or there's no semantic difference and their a performance and I would have to think about that a little bit there shouldn't really be a difference there they wouldn't be expected to be identical because the way the Treaty that certain away just indexes built this is a little bit of complex and you want somebody but out for performance questions I actually like the kind of but those to be handled that just because I was
I'd like to put a fierce more of these these features but here's an example of the exclusion constraints and action you know so you have from 1 2 3 I don't have much room to even you know people error message here with your messages pretty nice and it tells you what actually conflicts and it gives you all the slides information on so you can track down where the problem is I and so this so this here is kind of a straightforward example of of you know double booking a room for different speakers the course of the speaker is the same at the fire exclusion constraints on because it would violate both so
this is a simple way to avoid schedule conflicts and and it performs as far as I know it's it's the best performing weighted publisher because during the least work and I has the finest grained locking it won't block things unless there is a real potential for public which is which is nice but similar to the unique constraint if you're inserting 5 and 6 doesn't matter if they're all the same B-tree pager not 1 what block the other there's no way 5 and 6 reflecting the constraint and it's less error-prone I really recommend avoiding if if you aren't using this mechanism I really recommend avoiding trying to kind of work around it and improvises solution with triggers these solutions are pretty error-prone number ways it's it's easy to think you got it right when it's wrong and it's easy to think that OK well the topic of big blocks for now and not here for performance and then later decide optimizer because you actually do care and it's wrong and so there's there's a bunch of it falls on Sardis this recommend against that I and in
range types so this is this is why gave Davis going power use that TEST z range in last example this is a generalization of the period types so now we no longer have to to have 1 thing we don't have to go out and uses of the extension of the also that are the extension for the period datatype was somewhat limited in number of ways but the only work for 1 data type and so range types of are generalization that can work for many data types and providing again the the key aspect is that it's got a beginning and an end and then a bunch of operators in the index support necessary so as long as you got older datatype you can use a range it some are built into the system and and if not you can easily create your own range type out of some other datatype whether that subtype is built in a rather that itself is an extension you get from somewhere so this this
replaces the just before this replaces the period and so it
also offers a lot of these built-in range types so now rather than only using timestamp with time zone as the period datatype had you can also use it for dates regular time timestamps without a time zone up war or non-temporal types and then if you have some new type or new temporal type or a slight modification of how you like the range of that time to work then you can create your own range type there also offers a lot more flexibility and then we need to find a new function that operates on ranges then it will work for any of the range the range types that you have and then we need to find a new range type all the functions that they you define will work on all so in other words if you have a date range of time-stamped easy range and you define a function that works on a range of time step t z you automatically get it to work on data and so you don't end up with that kind of explosion of of operators and so forth and you can do a lot more than simply the management of world that's so kind of a
simple example this is I mean sometimes you have some you know say a hotel reservation might be by day check in check out in a conference room corporate reservation might be for a very specific beginning time like may 17th at 2 30 PM and so in that example you want timestamp with thousands of arranged logically they're very similar so you don't have to try and manage these 2 different data types and worry if you have all the operators for both you get them both here and you can manage them in the same way but just a different granularity and then after these cases since Caesar reservations of course you once exclusion constraints here as well but it doesn't have among my slides to include them so
I questions about the range if that sorry i yes you can define flow ranges and the floating-point timestamps of I recall there are maybe a few caveats there i'm because well for a for a number of reasons I guess but I have to think through you know for for for some users it's probably fine enough to think about that and I think it's it's probably OK but I'd be a little bit worried using flow of values missing exclusion constraints or some other hard requiring like that of but if you're OK with the fuzziness of floats enough to use them and and Europe here with the just running queries on women not trying to define declarative constraints I think you're OK and that you to publish up by creating a range type with floating 8 is
the subtype and they're not if you did that you would get all the operators available for the other relation types automatically and be able to create indexes on them automatically 0 yes and the
range types and it's got it's not represent original work it and it does not use normal semantics is treated as you know an unbounded range where you just don't specify anything as there treated more like infinity then like an old so although sensibly Knowles are supposed to can mean that you don't have anything there then they're kind of some strange semantics defined in the standard involving comparisons so you don't know what we don't want to use the semantics and so we treat it more like infinity on water heaters that that that's right credit they so
this finally text will be
available 9 . 2 so that's what we're pheno well on its way to being released but but it's not quite a bit of
so these things only talking about from here on out are our future ideas so on these are fairly hand-wavy at this point and I can likely will take various different forms and you might see them on the slides but it's no problems that would like to solve that I think are important and that would you know help and make temporal data management easier in a number of ways or faster to use so in this case are arranged he's and range for and he's so the idea here is that you know there's a concept of foreign key in a primary key on a table for the is this a quality but for ranges as of we're saying before the other operations that are more important for a range values and so I have arranged he would be for all range values in there you specify overlaps and as part of an exclusion constraints and from on range of attributes you specify quality and so that would give you you know if we can move back to this earlier example
here then it would kind of simplified this because since this is a common pattern we want to be able defining arranged the room common during it would maybe some annotation to tell it that during his arranger maybe you can pick that up on its own but the idea is that I had that allows you want to just by you know do that the common case thing this becomes a little bit of syntactic sugar and under the hood with is the exclusion constraint that if you need more flexibility of course you could go back to the exclusion constraints so I have to be good example of what I was going to move out of range foreign keys that the example range for a few so here I was talking about arranged he where is specified in the table may be referenced or made may not be but I'm just to avoid the necessity of in these in these are a little bit verbose just because there 4 but sible right whereas this common case here would be you get to specify with range here would automatically adjust because of some syntactic sugar is is what I'd like to have the arm and then have something that's
us offers a slightly
newer thing is that you so
range for and he would do more like what like what you're suggesting which is in the referenced table you have arranged the in the referencing cable you would have a range foreign key that would mean that every value in the by referencing side would need to be contained in the range on the referenced size so this would be kind of the the range equivalent of a foreign key the range analog I should say of a foreign key and that would allow you to I'm very simply that you know if you don't have known 1 table so you know trips Figure taking that might be for at a conference that all the activities that you have in some other cable must be contained within that time that you're in that in that city and so you can imagine a system like that yes I of course are I wouldn't either 1 is actually makes sense from a logical standpoint and how that specified here that might be something that you could you know so if you had this this is a little although open how you specify that exactly but absolutely there could be done and yeah he's he's 1 the of the right that's that's that's that's a good point in in format it would certainly be supported right that's kind of easy it's a subset of all of the other cases so I I mean if you if you really needed he's actually just make a singleton range anyway some of that of that time point which is 1 of the reasons why 1 of the sport's singleton as well on the range cases to make those those cases more interchangeable so but you is have the the the the at 1 of the reasons of all of the and that was on that that would be up my inclination be toward invalid and on the referencing side if you had say other non range attribute mean that the typical case you'd have a non range actually long range right so you have I D 1 2 3 and the reference and on referencing side and so you know in that case really on the reference inside what you want is you want to combine those those 2 couples because you you want ordinarily and I mean this you know you can kind of do whatever you want the mindful nation would be that on the referencing side you want to collapses ranges in 2 1 2 3 4 you know 1 come for inclusive to be a larger range so if I 0 I think it would a huge issue yes but I think I misspoke but either way you want those 2 ranges that are adjacent if if the other attributes of interest servicing you want to collapse those into 1 range and that would be my inclination although perhaps it could be the other way around but that would be my inclination is that you want to kind makes to makes some assumptions that you know that the referenced side was these kind of continuous nonoverlapping ranges in the hall he yes so if if they're the exact same ranges you can actually use ordinary foreign key yeah yeah but you want I think that the that contiguous non-overlapping and is in an interesting property may be and whether we want to require that are not enough to think about how difficult it would be to not required to or whether it's too confusing to not required required but that might not that's been brought up also as another potential constraint we can enforce this can be enforced currently that you know the idea that you know you can and you know you can't add something that's adjacent and so that might be another constraint that you want to use fork force add something that's adjacent with all the other attributes of interest in the same time so other Russians about the range of foreign keys arranged he's so another feature
that I'd really like to see and also kind example query that applies here this is this is important I think joins right now that equal joins are and are well supported the system right there's 3 different ways to do an act would join and maybe more than that of the count to the variants of and are that's that's very well supported but for joining ranges of cultural where in a 2nd but the idea is that you match up based on the portions of the range that overlap so rather than joining on a quality this is kind of a spatial join actually but the idea is imagine portions that overlapping this'll be an important for billing queries or actually
many different degrees but I think that the most straightforward as the inquiry like this where I you have maybe have some kind of rates in 1 table and usage another for some service and so somebody might starter stop using a service at arbitrary point in time that I should say at arbitrary points in time and end up the rates might change at arbitrary points in time as well as or perhaps based on the contract maybe the exchange of you know an hourly or daily basis or something like that so then I the usage could stand you know multiple here multiple rate periods of war or a period could encompass you know many users so you want to do a join and can take the portion that falls into that rate period and prorated amount and multiplied by the rate so that you're able to actually get a useful build them and there are many problems like this I just thought that this is the clearest case but you're doing this kind of matching based on the portion that overlaps on each side you know another example this might be if you're doing kind of fuzzy join and you want to kind of take you know say a log message versus some event that happens that's observed somewhere in the real world and you what and correlate the event with you know that's recorded with the system's time verses and another events recorded with some other systems I mean you wanna say OK well up at a window of 5 seconds on each 1 and now you have you know these 2 events that are actually ranges now and you can do joins on that as well then you could do that because he joined in that in that way so that the idea of a
really but I call range were joined by the goal that is what that pretty fast so that's that's our prior gotten the design of approximately idea of some ideas but that I've up you know proposed in some kind of free proposals how this might work but it turns out there's quite a few for ways to approach the problem and we'll have to see which way is to look the most promising and you know perhaps all of them have merit and in which case you know there may be multiple approaches would be implemented but the goal here is to make those joints faster because I think those are going to become more common in temporal applications truck you yes in in what form high just through the you would be my question would have houses what kind of partitioning going 0 I would like that very much yes I'm I'm not working on that but I think that I especially since range types are in the in the course now I think that and I think that's a very good idea that can hold metadata about the partition ranges from maybe right we maybe even able to use the exclusion constraint to make sure they don't overlap and you know but have for range partitioning I think that that makes a lot of sense but I wrote really like that that idea and I haven't gotten much further than that then just thinking OK this would be a good way to represent the metadata there are a few other problems with partitioning of course that we haven't settled but that might make and that change alone might make petitions scalable to more and more scalable because we'll be able to exclude faster much faster than doing the kind of of the constraint exclusion that we're doing now which I'm sorry for the terminology there how about that constraint exclusion is the partitioning terminology for excluding and they start a constraint but it's different than exclusion constraints and so on anyway but I I like that that approach and and other is not here and nobody has written the code to do that but I know I believe that some people are working on that problem I but I don't have much for operation then I like it I'd like to be done but I'm not planning to work on a personally but I might obviously involved to that and so I was that related to emerge Shauna was that some of yes so so back to the range were showing that I think of the idea here is to you know again this 1 of those cases us talking about the beginning where to do this you know in in this is long time ago would be very expensive and it is able to think strange types it's
a little bit better it's able to use median index nested loops and nested with an index on the inner side but but if not for that it would have to
and actually uses nested loop dislike for an non equijoin words this is a very you know slow kind process so I think the range types that help a little bit with this this query here but we
really are going to a a range
were trying to make those useful for and a larger the logical problems are more complex problems and bit you know little bit more and analytics on this so
by another of feature that I'd like to see tiny going back to the original example of this can be done but I kind like to see the standardized and I think a lot of people have some very similar ideas what they'd like to get out of the historical table or some you know like times that's really as far as requirements goes we need a historical they belong to and there maybe if you specialize requirements like whether the user is included in in the whoever created a deleted that record and other things like that the the trigger might report along with the data itself and the time and but of a question of the and I believe the time travel option you're referring to is based on transaction ideas come and that maybe associating those with and with physical times I am a little bit unclear on how that option actually work but it was related to the transaction ideas and when they were committed in Canada if you're able to do them a snapshot back in time all over right yes and so this is a little bit higher level than that I for 1 thing what people really want to do that for the entire database unless it's a very strict environment usually would just be a few tables that they really care about that have something to do with money your physical assets i'm or something like that of so the other thing I would say is that you're able to collect actually more information this way because you have information about the users that actually performed the operation whereas if transaction ID snapshots are all that you're using and then you know you don't have information about users and higher level you know access control related information so I think that you know this I'd like to see the kind of you know solution it's easier you know that can take on this simple requirement but maybe a few options and just do it had the system take care of it and and I think that that would make it much more common for people to do it for 1 thing and it would make the space of the obvious so I think it would kind of make people more aware that it can be done rather than needing to use a trigger something on their own but and would kind of standardize the way it's done in that would you know kind of developed you know some expectations so you know 1 person wouldn't forget to include some critical piece of information because it would be kind 1 the obvious options available to include it and people don't know what to expect from a performance standpoint another other angles and they just be handled more standardized and other thoughts on this or questions about them having at a table log is this is something that other people here would add value is is kind of a quicker way to establish an audit trail for a table is within your or work you know or we go over the and you probably block changes to the historical logs that would be 1 of the things haven't worked out all the
permissions yet but I know you probably block changes through history itself but there's a bit of work for the opportunity to work right water it's what how all but I don't think it's going to be a perfect solution there that's going to econometric lending there's no you got a history there CYT field columns maybe could add new columns here that lights and I I don't see a reason why that would be a you know prevented on but you know maybe you need to kind of record the fact that exactly when that call was added so that that way you knew that maybe some historical records not only is it just no but that they never even had the option at that point there's something so there might be some extra information quantity and but I think you could still allow adding columns but faces the nature that you're keeping all those old records means you really don't wanna remove columns and all right priority who you are and what we work with but that I would call that have a special requirement any idea you know along with you know when you offer something declarative to add a history log and then I personally I think it would be a bad idea to been also allow way to subvert that because then when you go back in time you're actually not seeing the state as it was at that particular time right you're not it's it's false so maybe if you have special requirement to do that but I don't know if I include that in the in the video is all about it's all probably trigger that I mean I unless there is some reason you know not to I mean I think I think a lot of things can be accomplished with the trigger here but I mean the the is the is the gasoline minor league and rephrase this is there some limitations terms that think all you know I mean it's it's an implementation detail this case and I think that you know part of the hesitation use triggers is that they're not declarative right some but if were able to offer something declarative using a triggers implementation detail but I think that they lose some of the more some of those concerns would go away because they understand that it wasn't just doing some crazy thing had a method to and in the interests of the problem with this you have 2 more along with the with the influence of a lot or power or an object property rights people's so all the way to 1 object agree I mean I think that I think that's kind of business requirements if you want historical pricing you you have to store the whole record in Canada you know do some sort of the normalized and all that all that information out from the other table and stick it in the original 1 or just blog above and in that way you can do it in historical can join and you know the the thing on it I think it may in in quite a few circumstances in this particular case I think would you be able to do this kind snapshot query on both of tables to get the old historical data but that could be done maybe with an index and a standard that that snapshot somehow it after you have that you've highly filtered then at that point you that's all care about is the act would join of those 2 subresults coming back together so I think you can do that with an ordinary equal although but there would be very similar queries that I think range merge join can be very useful for so I think it's very close to the case where you want range word showing that I think you can get away matches with undergraduate in in these instances the change from all products so that if there was but yeah maybe I don't if you wanted to do but adjoining where you've got some results from many points in the historical time rather than taking snapshots of each I guess as soon as you you do snapshots of each at a particular point in time they knew at that point can be made non temple world but if you're not doing a snapshot of the what like kind of full history of of of things going on and you want that joint result than the at least I believe that's true in the output of the role of each it about that for a while but I can hear how of the it how around what was all yeah so that OK yeah I think you could do so you're saying do it to do a range join on the 2 history tables to come up with a kind of a new history table of the hall of all the records of features that you know of fisher telling you this from yeah this again this is the problem that is that you know of that would need to yeah I think but that gives you have a better way of describing I think too about Christa's question is that you know of more intuitive way of looking at it is you could take the 2 history tables and join them and end up with the history table of as though you had just 1 table originally history that and so you could kind of l that the joint of the history tables is the same as the history of the joins table you know I I think I think there might be a better way describing you know something about 1 more kind of fun I
didn't throw up here also got a couple minutes and so this is kind of a little bit of a little bit out there but it's been requested quite a few times and I think that there's a there's some of merit here it's just that I think that we need to better understand the semantics of this and the use cases for those semantics in enough detail to really is to really find out what we want here but the idea is that you know you have many different ranges you're joining them together or you unioning them or a reading them in a in a union fashion and but obviously you can't have any existing range types you can have these multiple disjoint ranges that have a gap in between that just isn't right representable today and in range sites this is essentially saying well yeah you can even have gaps in there and it's kind of like a bunch of ranges that are all for water from left to right none of which overlap and none of which are adjacent to all of you know have you heard on something about that little bit of it I don't know if I can quite response that 1 of my the dog in order half of is the of the the of the right that you want agreed a bunch of ranges adjusted repeat what he said we're really discussed this very briefly but if you're you many ranges near aggregating then where the interrelation operator is the union of of the range so you can imagine rather than like some aggregating on plus this would be this other ranging in integration you know by combining arranges with the union then you don't want this kind of 1 range that kind out you know out there someplace to throw an error because you get union the Rangers because they're not touching any of the of the of the of the great chain of being it will right yes yes so you can add subtract cut out the middle level range so this this would allow you to do those kinds of operations and I think that I like I said I think I think there's something here that's been brought up several times but but I would need to understand the semantics in these cases that need those semantics a bit better to you to really do a good job at something like this and I'm sorry I can't hear you very well right now but then we need a big new data types for those return as well right is now that that in the domain has expanded right so with great right but that's that's a good question but I mean so part of it is you know with the array the array doesn't have the constraints preventing say overlapping Aura Misr order ranges right New York nearly wanted ordered in a non overlapping inspect that constraint would be difficult there's another but there is another problem with that if I recall of but I mean essentially you know it's it's similar to that that's also been a reason why I but it hasn't been implemented already is because we have this array of ranges from and so that can solve some of these use cases but it is the right we have all of that frame rate was over what our body yeah that would be that would be an interesting operations of fully I think I think probably the way this Leo this is out of the ideas I've presented I think this is the the most of you know kind of least were semantic standpoint them the least understood and so I think 1 way we might have to wait for week to see a people use arrays ranges to solve these problems and then see where that see where that is weak and then see if we can make a use case for multi ranges from I that the the place so I that think that to trying to record from the during of the year the year and here it says that the start in of the 1st object to in both everything up and then uses any happening is the same point maybe it's better represented as a picture of it and so of you on so that so arises in the context of the well in this repository about the nested data structures so that the along the so I I I think and I think that's a very good point right it's there's not all that many interfaces that offer that much and that it indicates the complexity of the problem I you know and therefore higher limits the use cases so I I think that I was responding only tongue-in-cheek but I would like to stress reliever theories here the the all actors that you all all is In general the number of so we use the theory of what people have in the equal but it's all but there's other I mean I would I would say but there's I mean from a theoretical standpoint there is nothing that prevents you from having complex sites of those public sites can be anything and so I would say that it's I wouldn't reject on theoretical grounds but I think that you know from a best practices standpoint if you find yourself using lots of multi ranges and the ranges in your base tables you know then it there's probably some design flaw there there's probably a schema design problem but and that they are more typically useful as intermediate results because of the closure of so I would I would agree I would agree partially workers and you
have high I haven't had these these things that people are worried about it but it is not the norm so yeah I think there's a lot of the connection between what I can work it out and spatial the spatial data types and work that that those 2 guys have been working on and you know what I'm hoping to take advantage of some of that and they they showed an interest in some of the going to look for growth and some of the right you have on your you're talking about the range merge join our yes I think I I think so I was thinking about that a little the during the talk but I would need to think a little more and I'm not quite sure about the respond to that on my feet is there's not now I I have so but we have I would like to spend more time thinking about that because if it were able to do those similarity joins more efficiently and his they're using it in a nexus with essentially and so if they had a better way of doing that I think it would work with some of their similarity searches and some of their strategy is the yeah I mean that but I think you can do it faster with some kind of a joint and then with the index an approach that they were using the government another perspective on that was 1 of my because my proposal that I have submitted the hackers involved a pre in Boulder pre-sorting plays but but that is not the only approach but so that's 1 of the things that will need to discuss and so that the algorithm was kind of you pre-sorting based on the 1st you know that the lower bound of the range and then doing had emerged strips strategy young and hesitant I there's always some kind of free you know processing of the data at runtime right so if you have like it's going to be involving some kind of a temporary index or store or something along those lines so I'm hesitant to say you know that wouldn't do anything because in a way creating temporary indexes that sorting in a way so so anyway but I think he's here I'm going and there's a right yeah I mean I wouldn't think of it very free because I mean it's not an order that would be semantically useful in some other way so it's not like we happen to have a sort of already because that's the order they put it in a rehab another sort of this way because of it happens that the user wanted nite order anyway because this would be have a strange order right by lower bound and and not to be useless but just that I I expect those opportunities to be less than what the Query planner is geared for finding already but I mean it's all kind of open for discussion and I think of you know to that center but you know I think it's a planet and find the sort order adding that the Tories there I think that this strategy that I propose to can have a lot of promise of that of you have a sense of the book is being used by the use of work plans or what have you been here most of the people or the to through the also of the analysis is based on what I know where that we're all people don't get is like the variance of the use of the word and it is that if you don't know what it is so this sounds like a multi range kind of use cases eventually OK so that I think but I like I like that of a use case and I think that the expense you'd be able to have cut pieces out of the schedule and search within their yeah but yeah I think I think that's a good use case of often about a more but I think that a direction
Algebraisch abgeschlossener Körper
Formale Semantik
Gebundener Zustand
Analytische Fortsetzung
Array <Informatik>
Einheit <Mathematik>
Ordnung <Mathematik>
Tabelle <Informatik>
Algebraisch abgeschlossener Körper
Formale Semantik
Mathematische Logik
Weg <Topologie>
Endogene Variable
Ganze Funktion
Disjunktion <Logik>
Tabelle <Informatik>
Konvexe Hülle
Offene Menge
Komplex <Algebra>
Wort <Informatik>
Prozess <Physik>
Natürliche Zahl
Komplex <Algebra>
Regulärer Graph
MIDI <Musikelektronik>
Wurzel <Mathematik>
Lineares Funktional
Physikalischer Effekt
Disjunktion <Logik>
Vorzeichen <Mathematik>
Konfiguration <Informatik>
Spannweite <Stochastik>
Automatische Indexierung
Projektive Ebene
Web Site
Physikalische Theorie
Inverser Limes
Operations Research
Speicher <Informatik>
Leistung <Physik>
Einfach zusammenhängender Raum
Physikalisches System
Design by Contract
Objekt <Kategorie>
Temporale Logik
Kategorie <Mathematik>
Güte der Anpassung
Temporale Logik
Zeiger <Informatik>
Kontextbezogenes System
Dienst <Informatik>
Rechter Winkel
Automatische Handlungsplanung
Analytische Menge
Spannweite <Stochastik>
Steifes Anfangswertproblem
Inklusion <Mathematik>
Attributierte Grammatik
Geneigte Ebene
Kartesische Koordinaten
Virtual Home Environment
Fahne <Mathematik>
Deklarative Programmiersprache
Prozess <Informatik>
Figurierte Zahl
Funktion <Mathematik>
Nichtlinearer Operator
Gesetz <Physik>
Verkettung <Informatik>
Funktion <Mathematik>
Strategisches Spiel
Matching <Graphentheorie>
Einfache Genauigkeit


Formale Metadaten

Titel Temporal Data Management in PostgreSQL: Past, Present, and Future
Alternativer Titel Range Types and Temporal: Past, Present, and Future
Serientitel PGCon 2012
Anzahl der Teile 21
Autor Davis, Jeff
Mitwirkende Heroku (Provider)
Lizenz CC-Namensnennung - keine kommerzielle Nutzung - Weitergabe unter gleichen Bedingungen 3.0 Unported:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen und nicht-kommerziellen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen und das Werk bzw. diesen Inhalt auch in veränderter Form nur unter den Bedingungen dieser Lizenz weitergeben
DOI 10.5446/19033
Herausgeber PGCon - PostgreSQL Conference for Users and Developers, Andrea Ross
Erscheinungsjahr 2012
Sprache Englisch
Produzent FOSSLC

Inhaltliche Metadaten

Fachgebiet Informatik
Abstract Range Types and Temporal: Past, Present, and Future Range Types didn't exist before, why do we need them now? How do they work? Why is "Temporal" important if we already have timestamps? How do we apply these concepts before deploying PostgreSQL 9.2? What's left to be done, and what solutions are in the works? I'll be asking the audience these questions, so -- Err... I mean: I will be answering these questions during the talk. Extensions, changes to core postgresql, and future ideas will be described in the context of solving a simple use case from 2006. These ideas build up to the larger point that powerful types are important, and database systems should do more to support them.

Ähnliche Filme