Materialized Views for MySQL

Video in TIB AV-Portal: Materialized Views for MySQL

Formal Metadata

Materialized Views for MySQL
Using Flexviews for MySQL
Alternative Title
Mysql And Friends - Flexviews
Title of Series
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date
Production Year

Content Metadata

Subject Area
Scaling (geometry) View (database) Query language Fatou-Menge View (database) Multiplication sign Authorization Parallel port
User-defined function Functional (mathematics) Sequel View (database) Coroutine Set (mathematics) Insertion loss Mass Parallel port Mereology Mathematics Software repository Query language Authorization Utility software Data structure Scripting language View (database) Plastikkarte Bit Mereology Motion capture Coroutine Demoscene Mathematics Buffer solution
System call View (database) Multiplication sign Quantum fluctuation Execution unit Coroutine Set (mathematics) Database Insertion loss Water vapor Medical imaging Mathematics SQL Server 7.0 Bit rate Mixed reality Computer configuration Synchronization Different (Kate Ryan album) Query language Set (mathematics) Row (database) Partition (number theory) Sanitary sewer Physical system Scripting language Mapping View (database) Block (periodic table) Interior (topology) Data storage device Database transaction Maxima and minima Motion capture Instance (computer science) Sequence Arithmetic mean Symmetry (physics) Order (biology) Right angle Quicksort Resultant Reading (process) Row (database) Server (computing) Functional (mathematics) Overhead (computing) Pay television Table (information) Sequel Variety (linguistics) Computer-generated imagery Maxima and minima Number Element (mathematics) Revision control Goodness of fit Cache (computing) Natural number Subject indexing Software testing Statement (computer science) Hydraulic jump Metropolitan area network Overhead (computing) Server (computing) Projective plane Counting Database Coroutine Multilateration Axiom Limit (category theory) Mathematics Cache (computing) Subject indexing Query language Function (mathematics) Network topology Revision control Statement (computer science) Vertical direction Table (information) Pressure Oracle
Group action Constructor (object-oriented programming) View (database) Plotter Multiplication sign Demo (music) Coroutine Set (mathematics) Insertion loss Water vapor Complete metric space Dirac delta function Parameter (computer programming) Food energy Neuroinformatik Mathematics Insertion loss Synchronization Social class Area Scripting language Theory of relativity View (database) Building Interior (topology) Database transaction Complete metric space Message passing Order (biology) Interface (computing) Right angle Ranking Quicksort Resultant Row (database) Point (geometry) Aliasing Functional (mathematics) Regulärer Ausdruck <Textverarbeitung> Table (information) Sequel Cellular automaton Local Group Queue (abstract data type) Software testing Selectivity (electronic) Message passing Summierbarkeit Plug-in (computing) Addition Scaling (geometry) Key (cryptography) Demo (music) Information Interface (computing) Cellular automaton Expression Content (media) Counting Correlation and dependence Coroutine Group action System call Integrated development environment Function (mathematics) Network topology Calculation Statement (computer science) Table (information)
Point (geometry) Sequel View (database) Code Multiplication sign View (database) Aliasing Set (mathematics) System call Product (business) Number Loop (music) Googol MiniDisc Right angle Procedural programming Table (information) Stability theory
again is my time start some of and starting 11 told it by little extra well that's is fine by me and my name's justice lots of I work at pagoda but I also author of flex views which isn't realize the solution for my scale and start query which is a starting at scale and parallel query solutions so we can
just an awful lot of In this of questions so authorities as part of smart tools nice slots a make sense but I have like to wearing some utilities like a buffer pool happening scripts and the scene at EDF Soviet arbitrary-precision mass you use the BC functions like BC-AD a B C sub inside my SQL but is it is a materialized view to OK so it says set of tools the 1st is plexity CBC insert change data-capture utility a talk more about it there's a little bit and their but sequel API instability eyes there's sort routines that allow you to define the structure of materialized views the so what's a
materialized view well materialize means to make manifest to actually store something somewhere right so unlike any a regular view a materialized view is
stored in a real table you define SQL statement the the select statement that uh is in the returned data and is stored in the user's disability cash right and so these the tables Our or dealers use available other databases I systems DB tuples lies great tables of Microsoft sequel server calls an index to use Oracle thousands snapshots of materialize use vanilla version you're using vertical some projections so this is a common database technology is not actually available in my SQL that doesn't have been materialized views the closest thing you really get my SQL of tree table I select right that's the store the results of a query into a table but there's no way to go back and update that table 1 row changes the axiom of view that is account star over a new billion row table you had a single row to that table yet the read count all those billion rows and want to get the results back but let's use doesn't have that limitation has the binary what consume reflects CC so can automatically update the materialized views so why would you want to use them well speed is the biggest reason right so the results are in a real table again so that table can be indexed or partition door you know do whatever else you need do it to get good performance as a result is taking a complex queries account star over a billion row and you know just look at the table it the count of rows and so it's basically instance then failed minutes or hours exons great for dashboards caching important result sets and anything that really needs a lot of speed it so this is the cash rate like other caches like memcached the caskets out of sync when the underlying data changes so we have to have a way to refresh materialized to you and then refresh should be as efficient as possible we don't have to rebuild the whole thing each time the bet is always an option the but so get 2 options of is 1
is complete which says they implies a completely rebuilds the table every time the advantages and later pressures that it supports our joints so if you have a view that needs an outer join you really have to use the complete refresh incremental is as the name implies incremental it'd just applies the changes that have happened since the last time the view was refreshed now water joints but now we randomly select Star Schema anatomy doing our joint anyway it supports most aggregate function is the only additive function it's not fully supported his grouping cats and if there is demand for that I'm sure I could but but it uses the row changes that have been collected since the last time the view was refreshed recessive incremental but it is very easy to set up in Dallas 1 hurdles you set up like CDC it's a set up scripts PHP you need to have a based binary log a jump 560 have to have full row images and so the lecture you don't use the short images indeed I recommend a read committed transaction isolation of all the reason being is that they're told us use insert selected instance like cast a gap walks in the of we can avoid gap blocks is better so use we committed it the what you set up the reflects the senior running in the background is randomized them that set up SQL to run the subscript traceable perplexity C uses the my SQL been log commands to read the binary log from the server and answers running in the background and so if if the server supports it the maximum unsupported it'll just the biological was sent to the server president connecting and again when it gets inserts updates for the weights along with the changes into changelog tables that's so you have to create symmetry you log or change log on your table was to create log it's done automatically fill up with all the changes that are happening to that table so the natural question that people normally ask is why not use triggers well 1st off that they had a lot of overhead light a lot of overhead so ideal for that they also can detect commit order the binary longest serializing can order which is very important for using it replication and for obtaining the views but you can't create triggers inside of stored routines so I could never create and the log that would actually create the trigger on the table because you can't creatures and sort routines and finally my ask only allows 1 trigger table so if you would have made a different trigger and that's outside of 570 mark you can I only have 1 sugar and that's been a big limitation so it is then reads the binary log externally and updates these tables so here ever create table the it my would have a create tables then at the very simple table 1 column you call flex views create and the log tests 1 so reading on test the 1 and then we insert a row into the table and then we can select from its and the log right so this when we were created in the log of tables populated with common belongs in maps test 1 2 MB log 75 to ever and and the 5 of them that they will a man and the skin and the reason I do that is the divine comedy 64 characters so that it's to support character table 64 characters feel it would be able to create a material as the log so users and the 5 instead now every edge realize you log has 4 columns that start a table with the ML side the unit of work ID the server ID the global sequence number and then the actual columns from the table right so these things although the good meaning right a inserts a 1 the lead to minus 1 updates or minus 1 poll I 1 excessively by an answer the view of the work that is the transaction ID right you know the work is a duty to turn the the trailers you element comes from the need to that's a variety of the server and then the global sequence number every change it's a new global sequence numbers of update while the old image of with global system a to and you move global sea with summer 3 the so that's how it is that some places orders changes that a transaction the document
fast so creating until I see this what we do well we have to use the stored routines to create materialized views but she's doesn't have its own built-in sequel 1st like a shock radius so instead of but it knew any select statement haven't passing the statement you have to build this statement up with these last reverting calls that really simple they're all documented online so my GitHub account and uh so you build up the sequel statement to create a view and and in the very easy understand doesn't take a long time to get used to creating these things right so every edge realize you has an internal materialized UID and an immutable has a primary key and the IT so the retailers intialized UID you create that initial idea using Flex use that creates the fix'd but the 4 parameters like the scalar name the name of the table to store the view and and then the refresh our to use Committee either complete or incremental the also a complete example at the end do incremental 1st so the sets in the equals last insert the cell and adeptly use less energy was a question of mind and that the practice talk so once you have a and the view you have to add tables to rights of Lafitte but his that table that materialized UID disable the adding running test he 1 a given alias right the last parameter is the joint plots as an honorary using clause that since this is the 1st table there's no joint class so you can see here we have a 2nd example with testy to alias to and there's the on class for the joint and support using as well so we have to add expressions to the sequel statement right that's up that's in years so I placed area where plus so here we 1st a group expression ever group on C 1 ever gonna call an alias C 1 right so it selects C 1 and C 1 and then we add another expression to them to realize you for the count so count star the article alias CNT now we have everything we need now we have a we have a from clause and we have a select clause need where clause to build this particular view so I'm just going to go ahead and enable it to complex use that enable but the the ID and that will create the actual table that contains the data for the deal here you can see we have 2 rows I answered a bunch of data into this table in meantime there 2 rows 1 into primary key 1 10 C 1 values and the age of a little over a million rows as an insert select a few times Philip table such critical right we now have this table that contains the contents of our select stated that we just built so what happens when the legends well we have to update the materialized views so was no longer scale so gets the or all-round dates so we periodically refreshed the materialized so we use the city insert the new value value to then these update we don't see to internalize right so we have to refresh it so if we run the real select we can see that there is that to their right so that's we noted Saturday so you use but he not refresh you can compute the changes into the Delta Devils or you can apply the changes from the delta to the view you can do both the reason why has a two-phase that calculation like this if you can periodically the computer changes the view especially the table is changing quickly telling every 5 minutes interview changes which you an update it once an hour once a day and you can update the last parameter to refresh social you is normal in all but you can use a transaction ID to update multiple use the same transactional point in time to you can have a bunch of different realize views that are all in sync with each other so I call but to use refresh give a damn the IDE compute and so they're always changes knowledge is the only changes and we now get a row in the tree lies view delta table right so it's an inserts into materialized view with the the counts are C 1 being added to the count being 1 right and you can see I answered the whole bunch roses millions of rows some ideas and has gone up a lot my transaction ID is not then you apply the delta so once again refresh in indeed when I say apply and then know all the wild later changes and then when I select from this realize you itself you can now see the row has appeared Bensalem and realize he's been refreshed with the changes complete use that you can uh work on the user to that outer joins or use grouping in Qatar or whatever but basically it is users create table selects and renamed table to manage the so here we create 1 demo top customers recall flexed use set definition flex use get IDE will get the materialized UID if you don't know what it is so update the related demo dashboard top customers and then that's the that's the setup customers and the select statement that that actually represents what the view will contain this 1 has a water by those the results in the table will be in the order of the order by which is convenient because as a primary key on the table so it's kind of an automatic ranking ranks you this much is enabled growing can get the get ID and I will just build the deal so but also pluggable if you wanna replicate to red as sort of PostgreSQL well order some other uh sister-in-law you won a popular message queue you invalidate men cash you can use but CDC to do this right so it's PHP scripts as a PHP interface that you can upon around so the plug-ins get insert update and delete information is that in addition to when the transaction starts and ends in a few other things so you can do all classical stuff like greater message queue that's probably too small to read write and so require once upon interface class but the plugin implements like the the bond interface then we have a begin transaction Insert Delete Update before not that after that so you can put whatever behavior you want into those functions and to promote anything you wanna do with your binary log data that every us mentioned about a lot it's awesome there's all kinds stuff the and then there's the quick reference create get bad expression right so really is pretty simple In the past inefficient and that is
be fit any questions the rest of the the all of the rightly and this is this is always a set of have a refresher on commit it's because my school doesn't have committed so I gotta get made out the binary log so it has to be asynchronous the that thank you it's about 8 years old this point and it's being used in production by a number of companies so it's it's that simple the so you was set in the maybe the colonies you can them because of my school that right so they actually become comforted prevents that will refresh and compute views of a scandal says like every minute you can refresh with the with using that the name of this the results of in the but procedures a disk replicated is call calls themselves so as all you have is the stored procedure that the code on the slave uh when you do the updates on the master the automatically replicate down the only time but 2 sets sequel along been binnacle 0 is when it's inserting it into the long tables themselves this create loop yeah from and wasn't but the and