We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

Learn how to add functionality to the MySQL server: a guided tour

00:00

Formale Metadaten

Titel
Learn how to add functionality to the MySQL server: a guided tour
Serientitel
Anzahl der Teile
55
Autor
Mitwirkende
Lizenz
CC-Namensnennung 3.0 Unported:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen.
Identifikatoren
Herausgeber
Erscheinungsjahr
Sprache

Inhaltliche Metadaten

Fachgebiet
Genre
Abstract
Hack on the MySQL code for fun and profit ! Ever wanted to learn how to add this little piece of functionality that you're missing in MySQL but the codebase seems large and thus intimidating ? Then this talk is for you: we'll enumerate the ways to extend and alter MySQL functionality and will get you started on hacking on the codebase: what the layout is like and where to fund stuff and documentation to support you. We will also review the state of the doxygen project in MySQL 8.0. The talk assumes you'd have working knowledge with C/C++ and algorithms.
44
CodeHackerOrakel <Informatik>CodeServerWort <Informatik>VerdeckungsrechnungSoftwareentwicklerComputeranimation
Produkt <Mathematik>FontBefehl <Informatik>InformationDesign by ContractCodeFunktion <Mathematik>Hinterlegungsverfahren <Kryptologie>EntscheidungstheorieOrakel <Informatik>Plug inKomponente <Software>ProgrammbibliothekArchitektur <Informatik>EntscheidungstheorieEINKAUF <Programm>DatenbankFormale SpracheComputerarchitekturMAPBitFunktionalProjektive EbeneResultanteZahlenbereichQuick-SortSystemaufrufGüte der AnpassungServerParametersystemZusammenhängender GraphBenutzerdefinierte FunktionCoxeter-GruppeSchnittmengeKernel <Informatik>ClientOpen SourceVerdeckungsrechnungInterface <Schaltung>InformationEinfache Genauigkeit
Orakel <Informatik>FehlermeldungFunktion <Mathematik>Dynamisches SystemLokales MinimumLoopParametersystemRohdatenDatenbankDatentypZeichenketteTypentheorieVariableBinärcodeFunktionalInverser LimesMereologieResultanteZahlenbereichSoftwarewartungServerParametersystemFehlermeldungZeiger <Informatik>DistributionenraumSchnittmengeQuellcodeObjekt <Kategorie>DickeMultiplikationsoperatorDienst <Informatik>Generator <Informatik>Prozess <Informatik>SummierbarkeitEigentliche AbbildungSoftwareentwickler
Orakel <Informatik>ProgrammbibliothekPlug inArchitektur <Informatik>TopologieFunktionalQuick-SortAbfrageServerProzess <Informatik>Dämon <Informatik>PunktAuthentifikationProtokoll <Datenverarbeitungssystem>SocketPlug inRechter WinkelInterface <Schaltung>TypentheorieÄhnlichkeitsgeometrieNichtlinearer OperatorDifferenteComputeranimation
Orakel <Informatik>EreignishorizontPlug inCodeServerFolge <Mathematik>NP-hartes ProblemRechenwerkAuthentifikationStichprobeComputerarchitekturBinärcodeBitFunktionalInverser LimesMereologieSystemaufrufVersionsverwaltungServerPunktAuthentifikationInformationsspeicherungFokalpunktTextbausteinObjekt <Kategorie>SymboltabelleProxy ServerPlug inDienst <Informatik>Interface <Schaltung>Computeranimation
Protokoll <Datenverarbeitungssystem>SocketInformationPasswortPhysikalischer EffektFunktionalMereologieZahlenbereichServerParametersystemWurzel <Mathematik>VerdeckungsrechnungSocketPlug inPasswortProgramm/QuellcodeJSON
Protokoll <Datenverarbeitungssystem>InformationStichprobePlug inOrakel <Informatik>AuthentifikationSocketPhysikalisches SystemPasswortQuick-Sortp-BlockPlug inFunktionalAuthentifikationDienst <Informatik>Programm/QuellcodeJSON
GravitationsgesetzAuthentifikationSocketLoginLastServerDeklarative ProgrammierspracheSymboltabellePlug inMultiplikationsoperatorProgramm/QuellcodeJSON
AuthentifikationSocketOrakel <Informatik>StichprobePlug inPolstelleQuellcodePlug inMakrobefehlPeer-to-Peer-NetzQuellcodeElektronische PublikationObjekt <Kategorie>BitDemoszene <Programmierung>Programm/QuellcodeJSON
Orakel <Informatik>Plug inAuthentifikationProgrammverifikationBitAuthentifikationPlug inProgramm/QuellcodeJSON
Plug inInstallation <Informatik>Orakel <Informatik>AuthentifikationProgrammverifikationResultanteServerPasswortPlug inProgramm/QuellcodeJSON
VektorrechnungKomponente <Software>Orakel <Informatik>Architektur <Informatik>DämpfungImplementierungComputerarchitekturTypentheorieBildschirmfensterAnalytische FortsetzungFunktionalInverser LimesLeistung <Physik>Physikalisches SystemZentrische StreckungServerZusammenhängender GraphAuthentifikationKreisflächeFokalpunktPlug inRechter WinkelDienst <Informatik>Interface <Schaltung>SoftwareentwicklerCOMZentralisatorComputeranimation
Orakel <Informatik>VersionsverwaltungKomponente <Software>GleichheitszeichenDienst <Informatik>BinärdatenLokales MinimumCodeImplementierungTypentheorieProgrammverifikationDrucksondierungFunktionalStichprobenumfangServerZusammenhängender GraphZeiger <Informatik>Bridge <Kommunikationstechnik>Arithmetische FolgeMailing-ListeKonfigurationsdatenbankPasswortSondierungMultiplikationsoperatorPlug inRechter WinkelDienst <Informatik>Interface <Schaltung>DefaultMehrrechnersystemComputeranimation
Komponente <Software>Dienst <Informatik>PasswortDienst <Informatik>ValiditätMereologieParametersystemComputeranimation
PasswortZeichenketteKomponente <Software>Dienst <Informatik>MakrobefehlKomplex <Algebra>Zusammenhängender GraphPlug inComputeranimation
Orakel <Informatik>ImplementierungStichprobeDienst <Informatik>VektorrechnungProgrammbibliothekMetadatenOracle CorporationZeichenketteAnalog-Digital-UmsetzerVariableDienst <Informatik>Deskriptive StatistikZusammenhängender GraphWeg <Topologie>Programm/Quellcode
PasswortOracle CorporationVariableProgrammbibliothekOrakel <Informatik>VektorrechnungMetadatenStichprobeKonvexe HülleZeichenketteTypentheorieMetadatenStandardabweichungPlug inDienst <Informatik>Programm/QuellcodeComputeranimation
ZeichenkettePasswortSchnittmengeKonfigurationsdatenbankDienst <Informatik>Zeiger <Informatik>Interface <Schaltung>ValiditätTypentheorieRechter WinkelPasswort
Orakel <Informatik>PasswortDienst <Informatik>Support-Vektor-MaschineZeichenketteSchnittmengeBitKlasse <Mathematik>ParametersystemDienst <Informatik>ComputeranimationJSON
Orakel <Informatik>Dienst <Informatik>MarketinginformationssystemTypentheorieAbgeschlossene MengeCodeWort <Informatik>Web SiteRechter WinkelComputeranimation
StatistikZeitzoneExplosion <Stochastik>SoftwareentwicklerElektronische PublikationSichtenkonzeptSchriftzeichenerkennungQuantenzustandArchitektur <Informatik>BefehlsprozessorAbelsche KategorieServerLoginE-MailVersionsverwaltungData Envelopment AnalysisCodeOrakel <Informatik>InformationVerschlingungDokumentenserverProzess <Informatik>ProgrammfehlerComputeranimation
AuthentifikationPhasenumwandlungServerQuellcodeKlasse <Mathematik>Modul <Datentyp>Web-SeiteElektronische PublikationAggregatzustandProtokoll <Datenverarbeitungssystem>DatenreplikationLoopBefehl <Informatik>Maß <Mathematik>ProgrammInformationsspeicherungFehlermeldungClientATMOrakel <Informatik>Exogene VariableCodeDiagrammFunktionalProtokoll <Datenverarbeitungssystem>Computeranimation
ATMDatenreplikationServerOrakel <Informatik>Plug inVektorrechnungFunktion <Mathematik>MereologiePrimidealBasis <Mathematik>Zusammenhängender GraphQuellcodePlug inDienst <Informatik>SoftwareentwicklerFunktionalXML
Orakel <Informatik>Migration <Informatik>Plug inVektorrechnungFunktion <Mathematik>SystemplattformDienst <Informatik>Desintegration <Mathematik>PunktwolkeRelativitätstheorieRechter WinkelCodeFunktionalFormation <Mathematik>VerdeckungsrechnungPlug inComputeranimationVorlesung/Konferenz
SystemplattformDienst <Informatik>Desintegration <Mathematik>PunktwolkeDeskriptive StatistikProzess <Informatik>Dienst <Informatik>SpeicherabzugMultiplikationsoperatorVorlesung/Konferenz
Videokonferenz
Transkript: Englisch(automatisch erzeugt)
Let's get started, I only have 30 minutes and that can go as long as we need it to. It's a wide subject, really. Okay, so a few words about me first.
I'm a MySQL developer, currently leading a team working on the MySQL server code. I've been a long-time guest to this conference, I really like it and I like the community in OpenCL, so I think it's important that MySQL is present here and we show what we are working on to you guys.
Right, so first of all I would like to say that please don't make any purchasing decisions based on the presentation here.
It's just for your information. Okay, so I want to talk about how you can go about adding functionality to MySQL. In a nutshell, it's a very easy answer.
Basically it's an open source project, you just hack on it and then you get results. But because writing database kernels is not an easy matter, we have come up with ways to make it a bit more simpler than that.
So those are the first three items on my agenda here. It's the basic way, the old way and the new way that we have in MySQL 8.0. And of course the last one is the traditional way to add functionality that you may think of. Right, so we'll start with the basic way.
Typically when you want to extend a MySQL database server really, what you really want to add is another function that you can call from the SQL language. And this is what user-defined functions are actually.
They operate on this basic level. They assume that you pass a certain set of arguments to the function and then you get a single result really. That's what it is. This interface has been around with MySQL for probably 15 years now. So there's a good number of usage examples of this interface.
And because it's that old, it's also very basic. So, well, it works for the basic stuff again, but not for the more advanced things that you may want to do.
That's the architecture of the UDFs. The client executes some sort of an SQL that triggers internally in the server, triggers a UDF call, which stands for user-defined function. And then that calls this external component that is loaded inside the server and that produces some result, which is then sent back to the client.
That's in a nutshell what it is. So pros and cons of that. As I said already, it's very easy, very ancient, very well documented because of that and kind of easily understood by people.
It's also dynamically loadable, so if you have a running server, you can install your extra shared object and then have it be executed by the server process. Cons are it can return a single value. That's the biggest limitation of it really.
And there's a couple of extra things like it cannot generate proper SQL errors. It can just return that value. And also it has kind of a limited data type support. It does not support all the modern types of the database engine.
And another kind of maintenance hurdle is that you need to be defining, if you have like a single shared object with say 10 functions, you need to execute create function 10 times for each of these functions. It's kind of inconvenient.
And here is an example of how one of those functions really looks like. So this particular one takes a variable number of arguments and then does certain things based on the type of the argument really.
Okay, so you can check for nulls there, you can return the set result and yeah, it's pretty basic really. And this is how it looks like when you want to compile it. This is based on the source distribution actually.
So if you have a binary distribution, what we call a binary distribution of MySQL, you may skip some of the include parts here, but well, that's the source and developer, so I work on the source. And that's how you compile this single function. And then this is how you install it as well and run it.
As you can see, it operates as expected. Basically, it will sum the one and the two on the last example and then add to that the length of the string which is three and you get the six. So yeah, kind of nifty.
If you want to run a service inside the server, I run an extra thread or have some sort of a background processing going on. The traditional way of doing that inside MySQL is called the plugin API.
So that has been around with MySQL since 5.0, I believe. There is also a book on the subject written by the architect of this interface, strongly recommended if you consider doing that.
And the premise here is a bit different. So those of you familiar with Apache modules, it's a similar concept here. Basically, you have the SQL query and then the server at certain points decides,
okay, I want to check for pluggable functionality. Like, for example, authenticating users is one such checkpoint. All the authentication in MySQL is done via plugins and it's because the server decides, okay, I now want to go
and instead of authenticating this user account, I want to go and search for plugins that do that. But of course, there is this other type of plugin which we call the daemon plugin. Basically, this only gets initialization and de-initialization when loaded or unloaded from the server.
And you are free to do your own background processing, like open a listening socket or whatever. This is, for example, how our new protocol plugin operates. It's a daemon plugin that listens on 33060
and listens for the new X protocol. Right, but the premise is simple. The server decides when to talk to the plugin and then the plugin can only talk to the server. That's it. Okay, pros and cons. I mentioned already some of those.
This has been around and there's a lot of examples. So you have a good starting point and a good boilerplate to work on. And also, storage engines are plugins of a sort. So it's driving important parts of the server, really.
So if you want to write plugins, there is not a very rough start. I mean, you have a lot of examples. And they are dynamically loadable. You can basically call server functions if you know them
because when you link the plugins, we'll see later, you link them with the server binary symbols. So basically it's a shared object that can call back into the server main binary, all the public symbols of it. Kind of interesting.
So if you lack this particular interface from the server that you need, you may just as well call the internal functions and be done with it. This, of course, has risks because you cannot really have portable plugins. Basically, if you do that, then you need to compile your plugin with every new server version that is out there.
But it's tempting because there's a lot of functionality that you can access. Okay, so cons are the biggest cons of plugins is that the plugins can only talk to the server. So server is like this focal point of plugins. And if you want, say, your storage engine to talk to your auditing plugin,
then you basically need the proxy service in the server that allows other plugins to talk to the auditing plugin. And that's kind of tedious.
I mean, for every new plugin that you add, if you want to enable it for other plugins, you need to do this extra proxy service so that the plugins can call it. Otherwise, it's only the server that can interact with it. That's the biggest architectural limitation, if you like.
Okay, so another thing is that you don't have explicit dependencies because of the plugins talking only to the server. So they expect that your plugin does not depend on any other pluggable functionality,
which is kind of limiting. Or you need to handle the dependencies yourself. Basically check, okay, so does this work? Then, okay, do whatever. Otherwise, just bail out. Right, so this is an example authentication plugin.
Probably the simplest one that we have. There is even a simpler one, but, well, it does not do a lot. It just says yes. So this one does a bit of, well, maybe it's not really well seen. Let's see if we can zoom a bit like that.
Is this better? Okay. So as you can see, this is the main function of the plugin that the server calls when it wants to authenticate user.
There's a number of arguments passed on top of it and then what it does, it goes to the Unix socket and checks the Unix user that's authentic, well, had that socket opened, and then if the user name of that user matches the user name
that is passed to the server, then it says okay. So basically if you are logged in as root, you can be root in MySQL. Or if you are logged in as Joe, you only can be Joe in MySQL. Nothing else.
But the good part is that you don't need the password because, well, it's already been provided to the system. So each plugin also has like a descriptor block of a sort. This is the authentication handler. So as you can see here, it's the authentication function on top.
And then some other service functions that we use for various occasions. I can go over this, but we don't really have the time for that. And then it's the plugin declaration. So this plugin declaration basically defines one public symbol
that the server searches for when it loads the plugin. And from there it all begins basically. Okay, so that's plugins. That's how you write plugins. Plenty of examples again. Go to plugin slash, and you will find at least ten of them
in the original distro. This is how you compile one. Sorry, this is one simple CMake file that you use to compile a plugin. Basically, you call CMake macro again. You give it the name of the plugin, the source files,
and then you say I want this plugin to only be compiled as a shared object. That's what module only is. And this half peer credits some CMake magic to check if you are running on Linux. That's what it is. Okay, and this is how using a plugin looks like.
Again, I'll try to zoom a bit. So you install it first, obviously, right here. And then, well, you create a user using that plugin because it's an authentication plugin.
And now you can basically just log in with it. And there you go. You are authenticated without a password. So that's how these plugins work. The server needs to know to call them and when it does call them, they produce the results they are expected to produce.
So, not very complex, I hope. And there's various kinds of plugins, too. I mean, various plugin types. There's the authentication plugin. There's all of these guys, like six or seven. Okay, so this one is our new take
on the plugin system architecture. We thought it would be easier if we start from scratch because there are certain architectural limitations with plugins that will not be easy to overcome
as a kind of continuous development. So we decided to make a completely new interface based on the idea of components and the kind of microservices that you expose and use. This is new in 8.0, so if you want to play with that,
you need high-scale 8.0. But, well, since it is new functionality, that's natural, I guess. As you can see, the architecture here is quite different. There's no server as a central focal point anymore. There's just a bunch of components
talking to each and every other component through services. That's the premise. I want to make a reference to the Windows com architecture. I know it's not very popular in your circles, but it does have some good ideas there.
And namely, the good ideas is that the interfaces are an abstract thing that may have one or many implementations. That's the basic good idea that I like in common. I try to implement here as well.
So, pros and cons of components, obviously. Each component can only consume services, so you get as a benefit an explicit list of dependencies, basically, you know what you need. But just based on the services that this component is asking for.
Then all components are IKEA and there's no central command, so to say. Every component can talk on an IKEA footing with every other component or implement or consume the services available. So, there's no special place for the server anymore.
It can also override existing implementations. As I said, a service is an abstract entity, so you can have many implementations of that service, but only one is considered the quote-unquote default implementations.
And if your code is asking for the default implementation and not for any specific implementation, then if you switch the default implementation, you can basically override functionality that may be in another component through your own implementation of that service.
So, it's obviously very different. It requires a central registry of all the interfaces and we have that. So, every component, when it is initialized, it gets an interface pointer to the registry. So, you can search for components.
And for services and implementations there. And then components register their implementations of their services into the registry as well. Right. So, some cons. We have a bridge between plugins and components.
So, we have a plugin service that allows you to get to the registry. So, if you have a plugin or you want to modify one, you can start with that and then access the registry through that bridge and then talk components. But, well, not all plugin APIs
are yet converted into component APIs. We plan to do that, but it does take time. And some APIs, some plugin APIs have like 300 methods. That's just not reasonable. So, that will take some work, but we are doing that evolutionally.
We took one plugin type, the password verification, and we converted it into components. So, you can see how this is done, comparing the 5.7 plugin implementation towards the 8.0's component implementation. So, there are examples already, but it's work in progress in a way.
Right. So, that's where we are at right now. I have some samples for you. I'll be quick to leave some time for questions. This is how declaring a service looks like.
So, you define your service by name. You define two methods for it, validate and getStrength. And then you implement the methods. So, this is the bottom part is how implementing one method looks like.
Basically, it takes arguments and then does something and returns whatever the definition of the method is right here. So, not very different from plugins is what I'm trying to show you. We have helper macros to kind of hide some of the complexities.
And it looks similar to plugins. So, if you are kind of comfortable doing plugins, you should also be comfortable doing components, I would guess. Okay. So, that thing before that was a service,
and this is how the description of one component looks like. So, you start by defining the component and what services does it provide. So, there goes your dependency tracking. And then you have a bunch of requirements.
Basically, it does need some services. So, it explicitly states what it needs, and that's really great when you want to load multiple components. You know which one you need to load and is it possible to load that particular component. And, of course, there is the standard metadata type of thing similar to plugins.
So, nothing scary there. I have an example here on how to implement this service, really.
No, sorry, how to use it. So, this is how you basically access the registry. We have the registry pointer, and then we call acquire with the name of the service over here, and we get an interface pointer,
which we basically typecast to the right type, service type of validate password, and then we can just call that, basically. So, that's the hard way. And there is a bit of a better way. We have a helper class for you,
which is called MyService. It does take all of these parameters and does the proper typecasting, so you can just, well, then call the thing. And it also does acquire and release for you when it goes out of scope. So, kind of a convenience thing.
So, a few words on the pull request. I won't go into details here because I'm sure you have encountered this site on the left, right? That's GitHub. So, basically, what I want to show you here is that whenever you do that,
basically file a pull request in GitHub, you automatically, we have some automation here, you automatically end up with that. So, we get a proper notification directly from GitHub with all the necessary information there, with a link to the GitHub repository.
So, we fully integrated pull requests into our own internal processes. So, that should be kind of easy for you. You just do your pull request, and then you communicate either via the GitHub or via the bug API.
Up to you, really. And last but not least, I am really proud because I spent a lot of effort on it. To show you the MySQL Doxygen documentation. So, starting with 8.0,
we are putting significant effort into documenting the code in the proper way. And I have most of the protocol documentation converted and updated to the latest details of the protocol into Doxygen, and there's also a ton of function and API documentation
that will help you reading the code. Also, some nice diagrams, as you can see here, explaining the workflows. So, yes, that's your new place to basically start reading the MySQL code. We intend to put a lot of effort into that
on a continuous basis. That's our prime documentation source nowadays. Even for internally ramping up new developers and stuff like that. So, you get basically the royal treatment if you read that.
Okay. So, what you should... There was a lot of technical stuff here and a lot of source code, but these are the important parts. You should always consider, at least when working with 8, trying to consider a component
if you want to add something. That's the most basic and most robust way that you can use. And you need to go to the plugin if you need some existing API or you need to call into something that's not yet a component service. So, that's when you go to plugins.
And finally, if you just need an SQL function, then you do your DFS. And if nothing else works, there's always the pull request. So, that's my talk in a nutshell. That's how you select. Okay. With that, I have four minutes for questions.
Anybody? Anything. It does not have to be plugin related. Any MySQL questions is okay. One thing I forgot to mention, MySQL has a brand new GA release
called MySQL 8. It has a lot of functionality, as you noticed. Not only under the hood, but also user-facing. So, please go and try it out. It has a lot to offer.
Yes. Right. Okay. So, basically,
we will update the documentation with the description of the services. We are in a process of doing that. So, yes, it's going to be properly documented in Doxygen, all these methods and services.
Yeah. It does go deep, yes. We plan to use it for our own internal purposes as well. Okay. Great.
Any more questions? Anybody? No? Then I thank you for your time and I hope you have a good time using or learning about MySQL. Thank you.