They're functional! They're efficient! They're persistent data structures!
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 34 | |
Author | ||
License | CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/38566 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
Bangbangcon (!!CON 2016)9 / 34
3
15
16
17
18
19
24
28
30
32
34
00:00
Functional programmingData structureComputer animation
00:12
Functional programmingData structureRecursionBitData structureLevel (video gaming)JSONXMLComputer animationProgram flowchart
00:42
Functional programmingRecursionData structureStapeldateiAlgebraic closureXML
01:22
Functional programmingComputer programmingNoise (electronics)Functional programmingSound effectOrder (biology)Moment (mathematics)Slide ruleNeuroinformatikCentralizer and normalizerPoint (geometry)Statement (computer science)Computer programmingComputer animation
03:00
Overhead (computing)SpacetimeRevision controlMereologyData structureControl flowRevision controlPoint (geometry)Entire functionArray data structureType theoryLatent heatNetwork topologyRight angleMathematicsCASE <Informatik>Electronic mailing listTrailBitNumberRootSoftware bugOverhead (computing)MereologySpacetimeSequenceMathematical optimizationMultiplication signGreatest elementComplex (psychology)GodShared memoryGroup actionRecursionDifferent (Kate Ryan album)Computer animation
09:40
Functional programmingArray data structureElectronic mailing listAlgebraic closureIdeal (ethics)Hash functionVector graphicsData structureScripting languageReading (process)Computer configurationType theoryNetwork topologyBlogAlgebraic closure1 (number)Array data structureQuicksortSpacetimeFunctional programmingImplementationHash functionLibrary (computing)Programming languageMoment (mathematics)BitPhysical systemIdeal (ethics)Bit rateMathematical optimizationSeries (mathematics)Goodness of fitLink (knot theory)Different (Kate Ryan album)Source codeFacebookMappingComputer animation
11:47
Natural languageCellular automatonComputer animation
12:02
Power (physics)FreewareZipf's lawComputer animationProgram flowchart
Transcript: English(auto-generated)
00:20
Hi everybody.
00:23
Okay, I'm not going to talk into the mic, I'm going to talk normally. Just like I always would on a stage. So hi, I'm Anjana Vakil. And today I'm going to tell you a little bit about persistent data structures, which are a thing that I found out about when I was at the Recurse Center last fall.
00:45
By now you've probably heard of it. Shout out to the Fall 2 batch in the house. Yeah! And when I was there, I found out that there's this thing called functional programming. And one of my batch mates, Sal Becker, who's a functional programming guy, closure guy,
01:02
sat me down one day and was like, you know, if you're interested in functional programming, you should probably find out about these things called persistent data structures and gave me a brief overview of what they are. And I was like, that's so beautiful and simple and brilliant and cool. And so that's why I'm here today, to spread my enthusiasm to all of you, hopefully.
01:22
So functional programming rocks, it's pretty cool. Make some noise if you agree with that statement. Awesome. I'm not going to talk really about what functional programming is or why it's cool. If you're curious, you can talk to all those people that just made noise. So but as we saw earlier in a quote from Mary Rose Cook, one thing that functional programming
01:45
does not have is side effects. So changing things outside of the particular computation or whatever you're doing within a particular function. And so one nice thing that goes along with functional programming is immutability, which also rocks.
02:00
And in order to figure out why immutability is cool, let's talk about rocks, because rocks rock. So this is a drawing of a boulder in Central Park and some people hanging out on it, trying to put a little alt text on my slides here. And I'd like to take this moment to read y'all a poem about rocks.
02:23
Nobody sits like this rock sits. You rock, rock. The rock just sits and is. You show us how to just sit here and that's what we need. It's a poem from the movie I Heart Huckabees, 2004. Existential comedy, pretty cool, check it out.
02:41
But the point is that immutability is like rocks. When we have some immutable data, it just sits and is forever. And that rocks because it allows us to avoid a lot of headaches that having mutable data that changes and we can't depend on, like rocks, can bring about. So when we're working with mutable data, let's say I have some kind of object, like
03:03
a little array of some numbers, 0 to 7 here, I'll give it a name because we like naming things. I'll call it foo because it's boring. And if I want to change something to make this a bit more exciting, like let's say I figure that one could really be fiffed up by changing it to an exclamation point,
03:22
I just change it in the data. And so now my thing that's called foo looks different. And that's convenient if I'm like changing things. But it's inconvenient because I have to manage like keeping track of who changed what about foo, when, and how, and that's a lot of overhead. And then if foo changes in some way that I wasn't expecting, somebody else changes
03:42
it from somewhere, then I could have a bug, and that's no fun, so frowny face. So if we use immutability, then once I create an object like foo, it is the rock. It just sits and is forever. It can never be changed. So if I decide that I want something similar to foo but more exciting with that exclamation
04:04
point in it, I could make, one way I could do it is I could make an entire new object called woo, which has this exclamation point in it. But in all other respects, it looks exactly the same as foo. And that's cool because now if something was depending on foo always being the same,
04:21
it doesn't break, but I still have my new exciting woo version. So if I am, this is one way of handling immutable data, but it's not great because I had to go through and copy everything else in foo that didn't change, the zero, the two, the three, the da da da da, and that takes up a lot of time and a lot of space, right?
04:40
So I have to go through everything in my array, I have to make an entire new copy, and it's like, ugh. And as soon as you get long things or big complicated objects or many, many copies, you run into issues. So there must be a better way of using immutable data, which rocks, as we saw. And spoiler alert, there is persistent data structures.
05:03
So the basic idea of persistent data structures is just that the old versions never change. Like as we saw in the really simple copying example, the old versions of an object just sit and are like rocks. And that's cool, but ideally in practice when we talk about persistent data structures, we're trying to do this in a way that's more efficient when I want to update something
05:23
or change something about the old version. And so we want to have that efficiency. And so the idea is to do magic. Huh? No. We don't have to make prayers to the gods of time and space complexity. We can actually use a really, really simple idea, which is to reuse unchanged parts
05:45
of our old versions of the data structure. So one really simple thing we could do is instead of using an array itself, we could make it a linked list. So now I have one pointing to two, sorry, zero pointing to one, one pointing to two, et cetera. And that's my foo.
06:00
And if I want to change the one to an exclamation point, I can make that exclamation point as a new node and then have that point to the old node of two. So the exclamation comes before two. But the problem is I'd have to also copy everything that came before the exclamation point. So I'd have to make a new copy of zero, point that to the exclamation point.
06:20
And so now I have my woo object, which is like a partial copy of my old list. And that's cool because we're saving everything after the list being copied because those numbers two, three, et cetera, we can reuse the old nodes. But if I wanted to change the seven to an exclamation point, I would still have to copy everything before it.
06:40
So in the worst case, it's like the same. So we could probably do better. And we can. Let's talk about trees. So this is a drawing of some trees. And in particular, when you hear about persistent data structures and trees, you usually hear about tries, which is like a specific type of try, but that's a will, actually. I just will, actually, myself, so who cares?
07:02
Anyway, so the basic idea is that now instead of having our array, we could split it up into nodes that have just parts. And I could split it up individually so that I have zero in a node and one in a node, but they might be lonely. So let's put them with a buddy. So we could have two things in each node, or we could have eight or 32 or however many
07:22
you want. But the idea is that we have these leaf nodes that contain our values, and then we add intermediate nodes in the tree that kind of pair up those leaf nodes or put them in groups of 32 or whatever it is. And then we go up and up until we have some kind of root node that unites all of our little nodes and gives us the tree that we previously were calling the array foo.
07:44
So now foo is our tree. And if I want to change something, all I have to change is one of those nodes at the bottom, not all of them, not a whole sequence, and then I can copy over that node and all of the other nodes between it and the root of the tree.
08:02
So everything on that path, it's called path copying, unsurprisingly. It works like this. We copy over that zero one node, and we change it to a zero exclamation point. And then we copy over any of the little intermediate nodes that were previously linking to the rejected boring one node, and ultimately we get a new root, which is our woo.
08:24
And so here now we are reusing a lot of the old structure of our foo tree, and that's called sharing structure or structural sharing. So here in yellow, we can see that all of the leaf nodes containing two, three,
08:41
four, five, six, seven, and then any intermediate nodes that weren't affected by us creating our new zero exclamation point node. Those are shared between the old version foo and the new version woo. And so in this case, this is better than our linked list example earlier, because if I wanted to change the seven to an exclamation point, I don't have to copy over all the other numbers. I can just copy that one node and reuse everything on the other side of the tree.
09:04
And this is, we're talking about updates here, but similarly it works for, you know, like adding something or popping something off, et cetera. And there's a whole bunch of optimizations you could do to make this like really, really, really fast and great. But this is the basic idea is just sharing the old data as much as possible.
09:21
And by formulating it as a tree like this, we can reuse a lot more of the structure than with other solutions. So I thought this was pretty cool when I found out about this at a recurse center. And I was like, yeah, let me do it in JavaScript. Okay, that's complicated. But it's not.
09:41
And I'm going to tell you in a moment what we can do to use these in JavaScript. But first, brief recap, mutability, cool, copying things over, not cool, because it wastes time and space, sharing old data, awesome. So if you want to do this kind of thing in a language like JavaScript, as I did, you could use a couple of libraries that allow you to just plug and play basically
10:04
these type of persistent data structures. So one is Mori, which is a closure script port by David Nolan. It's pretty cool because it has a totally functional API. You can check out the link there for its GitHub, which has really good docs and everything.
10:20
There's also ImmutableJS, which was a JavaScript library created by Facebook. And that has a bit less of a functional API. It's got like methods on objects, so it sort of seems like you're changing something, but you're not. You're really not. You're creating new ones. But it's a bit smaller than Mori, so also a valid option. Anyway, and if you use a different language than JavaScript, Google it.
10:41
There's probably a library. For example, there's PySystems for Python, because puns are great. And if you don't want to use libraries, you could also try using a functional language. Like Clojure, for example, which has these things built right in. And so what I've been talking about, these trees or tries, whatever, well actually, is kind of inspired by the way that they're implemented in Clojure,
11:04
which Rich Hickey designed the implementation of those data structures like vectors and hash maps and whatever, based on a paper by Phil Bagwell called Ideal Hash Trees, which is some good further reading on this topic. If you're intrigued by this concept, as I was, and you want to learn more,
11:23
you could check out that paper, go right to the source and read Phil Bagwell's paper and associated articles. Or I highly recommend this series of blog posts by Jean-Nicolas Laurent, called Understanding Clojure's Persistent Vectors, that explains a lot more of the optimizations and stuff that you could do here. And yeah, basically, persistent data structures are really exciting.
11:42
And if you want to get into functional programming, you should use them, because immutability rocks. So, that's it. Thanks very much. I'm Anjana Vakil, tweet me at Anjana Vakil. And thanks to all the RC alums that helped me with this talk and taught me about this in the first place, like Sal Becker. And thanks to Beg-Beng Khan for having me. Congratulations.
Recommendations
Series of 9 media