PL/Parrot: Cutting Edge Free Software
Formal Metadata
Number of Parts | 64
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/45945 (DOI)
FOSDEM 2011, 4 / 64
Transcript: English (auto-generated)
00:08
Just a quick plug for my employer who is very kindly paying me to be here today. I work at VMware. We're doing very, very cool stuff.
00:20
Anybody that wants to learn about this cool stuff that we're doing before August, please come talk to me afterwards and maybe we can see about getting you on our team. What I'm working on at VMware is not this.
00:40
It's not even closely related to this, but it's still pretty cool stuff. Okay, so, one moment.
01:00
So, I'd like to talk about this. This is, I don't know if you can see it, there's an elephant there riding in on a parrot. And the woman who is now my wife, actually as of last Saturday,
01:22
she made this image for me and I really appreciate it. Thank you, Sarah. Okay, so, we have here a blue elephant that's Postgres-ish and a parrot, and we're trying to bring them together.
01:40
Normally, you don't think of having bleeding edge features on a relational database management system. It's just not how we normally think of these things. We think of them as sort of stodgy and old-fashioned and maybe outmoded,
02:08
but this is Postgres. This is not stodgy and old-fashioned. And so, we do have these cutting edge features, and I'd like to start by thanking the people to blame for this new feature.
02:23
That guy in the middle, Jonathan Leto, he's been blazing a path through the Parrot space, and he was the person that turned my wacky idea of embedding Parrot in PostgreSQL
02:41
into actual code that actually executes. So, this is this really important difference between the visionary, that's me, and actually having something happen. So, he's kind of the person that we really owe the credit or the blame,
03:06
whichever way you want to look at it. This guy, Josh Tolley, anybody familiar with him? No? Okay, so he is the person responsible for embedding LOLCODE inside Postgres.
03:22
You can now run stored procedures in LOLCODE inside Postgres. Is anybody familiar with LOLCODE? I see a few hands. Is anybody hung over? Nobody's copping to it. Okay.
03:44
The guy on the right is David Wheeler, my colleague at PostgreSQL Experts, a consulting company in which I hold a stake. He did the pgTAP tests, which should be familiar to those of you that write Perl.
04:02
Is anybody familiar with TAP? A few hands. So, TAP is the Test Anything Protocol. Turns out it's not Perl specific and it's handy for lots and lots of different, let's say, test-centered methods of development.
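To illustrate why TAP is not tied to Perl, here is a minimal sketch in Python that emits TAP-style output: a plan line (`1..N`) followed by one `ok` or `not ok` line per test. The function name `tap_report` and the sample checks are made up for this illustration and are not part of pgTAP or PL/Parrot.

```python
def tap_report(checks):
    """Render (description, passed) pairs as Test Anything Protocol lines."""
    lines = [f"1..{len(checks)}"]  # the plan: how many tests to expect
    for number, (description, passed) in enumerate(checks, start=1):
        status = "ok" if passed else "not ok"
        lines.append(f"{status} {number} - {description}")
    return "\n".join(lines)


# Hypothetical checks, evaluated up front for simplicity.
checks = [
    ("addition works", 1 + 1 == 2),
    ("string is upper-cased", "tap".upper() == "TAP"),
    ("deliberately failing check", 2 * 2 == 5),
]

report = tap_report(checks)
print(report)
```

Because the output is plain text, any TAP consumer can read it regardless of the language that produced it.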
04:24
I really wouldn't call any method of development test-driven. See, tests don't up and come up with a set of requirements and start writing software. So, tests don't drive development and never will. But you can have tests and they're handy to use when you're developing software
04:45
and you should never omit them. Down there in the middle, that sort of dark picture is of me. I'd also like to thank Daniel Arbelo Arrocha, whose picture I was not able to get in time for this talk.
05:09
Okay, so Parrot is a virtual machine for dynamic languages. It's register-based and it's really, really cool.
05:25
It's pluggable, it's interoperable, it's dynamic. PostgreSQL is, strangely enough, the world's most pluggable database. Anybody know why that is? Yes?
05:51
So, the answer was you can incorporate Perl scripts in PostgreSQL stored procedures, which is true. But what I had in mind was the fact that Postgres hung out
06:02
at the University of California at Berkeley for many years, being the host of people's PhD theses. So, it was a place where you could sort of test out your new ideas
06:20
at every level of a database management system, however strange. And so, the result of this was an architecture which is pluggable at every level, at least in theory. There's a few parts that have been sort of welded together in the interest of performance, or just out of the disinterest in actually plugging something in there.
06:44
But that's how it got to be so pluggable. It's also extremely standards compliant. As far as I know, no database management system in production is more compliant with SQL:2008 than Postgres,
07:03
and that includes DB2, DB2 being sort of the place where SQL really got started. And, of course, PostgreSQL is why I'm here, because if it weren't for Postgres, none of you guys would have heard of me,
07:23
and as it is, only some of you have. Okay, so why are we doing this? Well, when you're going to embed a language in Postgres, the process is sort of, it's a pain in the neck.
07:48
Really, really serious pain in the neck. There's a lot of repetitive work that goes on. There's a lot of sort of cut and paste duplicate code that goes on. And just generally, as a lazy person,
08:03
it offends my sensibilities that all this has to happen every time someone wants to plug a language into Postgres. The process should not be so arduous. One goal I'm aiming for with this project is for you to come up with a language design
08:23
that you're interested in in the morning. Well, okay, so maybe you took longer than that to design the language, but you choose it in the morning, you look at it over the course of the day, you plug it into the Parrot system,
08:42
and by the time you log out for the evening, it's done. That's what I want. Okay, so, in order to get this, we're making a PL toolkit. That's what PL/Parrot is intended to be.
09:05
It's probably not going to be as fast as, say, a PL/LLVM; you could imagine such an embedding. And maybe that's actually a better idea for the high-performance kind of applications.
09:24
But it's meant to be able to do rapid prototyping in the real sense of actually being able to knock out a prototype quickly and have it work. So, for just a moment, I'd like to digress into something a little philosophical here.
09:45
What we're actually doing is what I call the anti-cloud. So, in the cloud model, what you have is you disperse your data
10:02
and you disperse the computational resources that you apply to the data, all sort of around a network. What's happening here is what we used to call active databases.
10:20
This was one of the great debates of database management back in the 70s and 80s. And on the passive side, you basically had the idea that the data store was to do nothing except store data. And on the active side, you had this idea
10:41
that you were going to have this enormous chunk of data and then right close to it, you were going to put sort of computational resources and actions and things that went on with it. So, one of the first outcomes of this was a performant foreign key implementation.
11:04
Another thing that came out of the active database idea was triggers, which we now take for granted as being something that you can do. Stored procedures was another thing that we now take for granted that you can just do in databases.
11:21
Nobody really thinks of that as revolutionary or innovative, but it did come from this sort of philosophical split. Okay, one more little bit of philosophy. This one is a little gem from the Ruby community.
11:42
Anybody want to guess what that is? Don't repeat yourself. Don't repeat yourself. I just can't emphasize enough how much of a pain in the rump it is to make a PL in Postgres right now.
12:04
The documents themselves say something along the lines of insert several thousand lines of code right here and then you'll have a new PL. Well, that's not the right answer.
12:22
Another thing we'd like to be able to do is write things in PL/Perl 6 and then call them from PL/Python. Does anybody notice anything strange about that second language? No? Okay, it doesn't have a capital U at the end of it
12:43
because right now the embeddable Python implementation is one that can only be called by the Postgres superuser, or rather it can only be written by the Postgres superuser, and this is because the Python that we're using
13:02
to embed in Postgres can't be limited in its extent. It can't be prevented from opening file handles or pipes or other things as the Postgres system user which let's say sometimes you want a little bit more sandbox
13:26
than that provides. So, what we'd like to do is sandbox the virtual machine in which all these operations occur
13:40
and then pretty much by magic we get a trusted PL/Python. Okay, so how do we get to there from here? First things first, we're going to do a PL for Parrot Intermediate Representation.
14:08
Anybody familiar with PIR? Okay, that's one more than usual. PIR is sort of the assembly language for the Parrot virtual machine
14:24
only unlike an actual assembly language you can do some pretty high level stuff in it and it's really cool. It's actually kind of fun to program in. It brings you back to the time when you may have
14:45
gotten really close to the machine. In this case you're getting close to the virtual machine and the virtual machine does a little more so being close to it lets you do powerful stuff. Okay, so that was the first thing. Whoops!
15:01
All right, so our first challenge is embedding Parrot. Right now the Parrot embedding is a little bit in flux. You need Parrot master and PostgreSQL master.
15:25
Well, not so much anymore. It'll actually run on PostgreSQL 9.0. So that's actually a good thing, but if you really want some of the more cutting edge features we'll probably be tracking Parrot master and PostgreSQL master.
15:44
This is one of the very strange things about working in open source or free software. Yeah, I know they're not exactly the same thing but close enough. Is that you get to work with sort of cutting edge projects
16:01
and this has upsides and downsides. I'll talk about the upsides right now. In the process of getting this embedding done we found three or four bugs in Postgres and fixed them and got them into the Postgres code. So that's the upside.
16:22
The projects can sort of play off each other and improve each other. There are downsides. I'll talk about those later. Okay, so that was the embedding thing. We did manage to get PIR embedded. The next thing to do.
16:41
So now there's a sort of an assembly language like thing that you can use to write stored procedures in. Well, okay, as cool as PIR is most people will not want to write code in it so we need a high level language. Some of the high level languages that run on top of Parrot right now
17:03
would be Tcl, Python, Ruby. There's a project that's supposed to be doing Perl 5. There's Perl 6. That was what Parrot was originally designed for. So those are some of the HLLs that we should be able to get in relatively simply.
17:30
The HLL API is a little bit rough, but we do actually have, well, actually it's PL/Rakudo, not PL/Perl 6. Does anybody here care about the difference?
17:44
No. Okay. So we have HLLs. The next thing we need to think about is how to marshal data in and out of Postgres. Now this is a little bit more complicated than it seems
18:09
because PostgreSQL has this amazing type system. I know of no type system in relational database management that comes close to the flexibility and the power of Postgres
18:23
but what that means is that there isn't just a list of types that you have to support if you're really going to have Postgres support. You have to have some way of creating new types in your language binding.
18:41
So as of now, we support basic data types, so text and the numeric ones. We're working on the time types, and we need something to fall through to, which is bytea, or sort of a blob of bits,
19:04
which has to be the fallback mechanism for all of these types. It's at this point that I would like to beg for your help. If you're interested in this sort of thing, this is a great place to get started on PL/Parrot.
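The marshaling approach described here, with specific converters for known types, a way to register new ones, and an opaque blob as the fallback, can be sketched as a lookup table. This is only an illustration in Python under stated assumptions: the names `CONVERTERS`, `register_type` and `marshal_in` are hypothetical, and the real PL/Parrot code is C that works with PostgreSQL's own type machinery.

```python
from datetime import date

# Hypothetical converters keyed by PostgreSQL type name. Each one turns the
# textual wire form of a value into a native value for the embedded language.
CONVERTERS = {
    "text": lambda raw: raw.decode("utf-8"),
    "int4": lambda raw: int(raw.decode("ascii")),
    "float8": lambda raw: float(raw.decode("ascii")),
}


def register_type(type_name, converter):
    """Teach the binding about a new type, such as a user-defined one."""
    CONVERTERS[type_name] = converter


def marshal_in(type_name, raw):
    """Convert raw bytes to a native value, falling back to opaque bytes."""
    converter = CONVERTERS.get(type_name)
    if converter is None:
        return raw  # fallback: pass the value through untouched, like bytea
    return converter(raw)


# An unknown type falls through to the blob-of-bits fallback.
print(marshal_in("point", b"(1,2)"))

# Once a converter is registered, the same call yields a native value.
register_type("date", lambda raw: date.fromisoformat(raw.decode("ascii")))
print(marshal_in("date", b"2011-02-05"))
```

The fallback means an unsupported type never blocks a function from running; the language just sees bytes until someone writes a proper converter.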
19:23
Okay, that's the marshaling thing. Well, I've gone way too long here already without showing you some actual code so I'm going to do that right now. This is PL/PIR. So what we do is
19:43
we create this function. It has a name. It has a type input. Whoops, get back there. Okay, so it has an input type and an output type. It has a language.
20:03
Then we say AS and then we transition from one language to another language. This is one of the things that I think is really, really interesting about Parrot: this idea of formalizing the transition from one language to a different language.
20:23
Anybody that's written an interpolated double-quoted string has sort of done the same thing. You're going from one sort of language context and that double quote mark is telling you that you're moving to a different language context.
20:45
But it's so small and so simple that it's sort of easy to miss that this is what's going on. Whereas Parrot has sort of made this explicit and then I think there's some interesting possibilities
21:02
as to what kinds of things one could do with this into the future. Anyway, I'll talk a little bit more about that later. So we have .param num x, then add 5 to x, then .return x.
21:21
Not super exciting code, but you could sort of imagine what's going to be the result there. Here's another one that just handles strings directly. Now, what I dimly recall of assembly, which is what this really is,
21:42
didn't really have a concept of strings per se. Maybe I wasn't quite thinking of it the right way. Okay, so that's PL/PIR. Now, there's not too many small children in the room,
22:03
so that's good. I've got a little scary thing here. Anybody decipher that? That is Perl 6.
22:22
It's really important to comment code that looks like this if you absolutely have to write it. Don't do what I've just done. Okay, so that's the languages we have. That's sort of what they look like in practice.
22:42
At the moment, PL/Perl 6 is a little bit broken because the Parrot project was running at full speed and the Rakudo project was running at full speed and they weren't quite synced up together. And so if you try to create this function,
23:00
that'll work fine right now. But if you try to execute it, it will peg the CPU and eventually crash your back end of Postgres. You won't lose data or anything, but it's kind of annoying. And that's one of the things that I would like to get your help fixing.
23:25
Another thing we need to work on is access control. So we have some idea of what sort of controls we want to put on access and we have at least the ability
23:41
to deny direct access to the file system. So at least that kind of attack is not launchable through a trusted PL/Parrot implementation. We would like configurable controls, some more control over the network,
24:02
so opening up network pipes, and of course some sorts of tests for this kind of access control. We have, as a project, actual PLs, TAP tests from pgTAP, a Git repo, an issue tracker.
24:27
Maybe someday Postgres will have one of these. And a Freenode IRC channel. Thanks, Freenode. We also have packages for Debian and Fedora,
24:43
and I don't know what other OSes, but there are packages, and so you can just use the packages and that's it. We also have lots of enthusiasm. This is a very important part of any ambitious project.
25:01
If you're bored with it, there's no way it can ever succeed. So here's what we plan to do soon. We plan to sort of get things back working again,
25:20
some better argument passing, some more data type marshaling, just sort of cover a few more of the built-in types, and lots and lots more tests. What we'd like, some better sandboxing than we have now. It's a little bit ad hoc, and ad hoc sandboxes are great for playing in,
25:43
but they're not so good for access control. More HLLs, if you have a favorite language that you want to see in Postgres, and just kind of want to make a name for yourself, this is your opportunity, and that of course leads us to more developers,
26:02
and of course users, because unless the thing is out in the world, it's kind of a toy. Let's see here. How you can help now?
26:20
Well, I like to put in something really concrete, and I found one just this morning while I was reading over the source code again. Basically there is a thing in the source tree that's called plparrot.c. It contains two languages which kind of have to be loaded together,
26:42
and that's sort of a modularity violation that would not be hard to fix. It would give you sort of an entry point into the source code of PL/Parrot.
27:00
Another thing we could use some help with right now is ensuring through the Postgres dependency system that Parrot-based PLs explicitly depend on PL/PIR and cannot be loaded without it.
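The rule being asked for here, that a Parrot-based PL cannot be loaded unless PL/PIR is already present, is an ordinary dependency check. The sketch below models it in Python as a toy; it is not how PostgreSQL's dependency system is implemented, and the class `LanguageRegistry` and the language names `plpir` and `plperl6` are assumptions made for the example.

```python
class MissingDependency(Exception):
    """Raised when a language is loaded or unloaded in an invalid order."""


class LanguageRegistry:
    """Toy model of languages that may only load after their dependencies."""

    def __init__(self):
        self._requires = {}   # language name -> names it depends on
        self._loaded = set()  # languages currently loaded

    def declare(self, name, requires=()):
        self._requires[name] = tuple(requires)

    def load(self, name):
        missing = [d for d in self._requires[name] if d not in self._loaded]
        if missing:
            raise MissingDependency(f"{name} requires {', '.join(missing)}")
        self._loaded.add(name)

    def unload(self, name):
        # Refuse to remove a language that a loaded language still needs.
        dependents = sorted(
            other for other in self._loaded if name in self._requires[other]
        )
        if dependents:
            raise MissingDependency(
                f"{name} is still needed by {', '.join(dependents)}"
            )
        self._loaded.discard(name)

    def loaded(self):
        return sorted(self._loaded)


registry = LanguageRegistry()
registry.declare("plpir")
registry.declare("plperl6", requires=["plpir"])

try:
    registry.load("plperl6")  # fails: the base language is not loaded yet
except MissingDependency as error:
    print(error)

registry.load("plpir")
registry.load("plperl6")
print(registry.loaded())
```

The same check in the other direction stops the base language from being dropped while something built on it is still loaded.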
27:22
Then I'd like to see about expanding PL/PIR's scope so that when you start to build things on top of it, you're not finding yourself needing to build extra features in order to get it to actually work with the database. And, of course, you can go to GitHub
27:42
and take a look at our issue tracker. All righty. So, into the future. Now, this is where we begin to go into things that are a little speculative. I'd like to see about tighter Parrot integration in Postgres.
28:04
There's some little licensing issue, but I think we could get over that. But one of the things that Postgres has is an amazing SQL engine. It's so amazing that we actually managed
28:20
to break Bison with it at one point because it was too big, right? So we actually, for a while there, we actually had to use a patched version of Bison in order to build the grammar.
28:41
I think this is maybe a sign that YACC is starting to maybe be a little too small for what we're doing. And I'd like to see a way to make the SQL handling
29:00
done through Parrot because Parrot is really built for this sort of thing. You have a large grammar. You can pretty much get an executable out of it very quickly with Parrot. Another thing that I would like to do is sort of have this transition between SQL and PLs
29:20
and among PLs. This is the kind of thing that I think Parrot would actually be very good at as far as handling that transition in and out and cleaning up after itself and doing all those fun things. And then, of course, there's the all-important stuff you create
29:42
because I can't think of everything and the more people we have contributing and creating and criticizing and just generally making the thing better, the better it'll be. You can find more info at pl.parrot.org
30:01
and it's at this point that I'd like to open up the floor for more questions. Yes. Hang on a second. Let's get the mic.
30:24
It's on? It is. You need people to work on date/time support. I try to do that around Postgres. Can it be done with SQL's insane time zone system?
30:43
Well, you know, calendaring systems as a class are some of the hardest things that we attempt as a species. If you don't believe this, have a look at some of the calendaring systems that we've built like Stonehenge.
31:06
These are things that are not trivial and I don't think it's SQL's time zone system that makes it so. What happens is that we have the confluence of physical phenomena like orbital times of various astronomical bodies
31:27
like the Earth and the Moon and the Sun, possibly some other ones, and then we have this thing laid on top of it which is a crazy quilt structure of law and custom,
31:41
not all of which is meant to fit together at all. So, the question of, you know, handling dates and times, I think if it were to handle things the way Postgres does, it would be sane enough for practical purposes.
32:02
Next. Alrighty. Well, I'll be around for the rest of today if you have questions, comments, brickbats, and I'd like to thank you very much.