Sharing memories of Python and Rust
Formal Metadata
Number of Parts | 490
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/47419 (DOI)
FOSDEM 2020, 283 / 490
Transcript: English (auto-generated)
00:05
Our next speaker is Raphaël. He's going to introduce himself and his topic, so a big round of applause. Thank you all for coming. My name is Raphaël Gomès. I work at Octobus.
00:21
We're a small consulting company specializing in Mercurial. I'm going to talk about how we use Python and Rust together in this big, old project. When I say old, I mean it's an old code base. I'll cover what the pain points were and how we fixed some of them.
00:40
As a recap for people who don't know what Mercurial is, there are a few of them. It's a version control system, same generation as Git. It actually was made in the same month of April 2005 by a kernel developer. It's written mostly in Python. It has a decent chunk of C extensions, mainly for speed.
01:01
It handles huge repositories for companies like Facebook and Mozilla, for example, with millions of files and revisions. It has a very interesting and powerful extension system, which I have very little time to talk about. Maybe I'll have a few snippets. It's very cool. Check it out on your own. We're here to talk about Rust and why we chose Rust for Mercurial.
01:22
We just said that we have 40,000 lines of C code in Mercurial, so why switch to Rust, why move? Most of you know what Rust is and why it's pretty good, so I'll keep this short. Basically, it has a better signal-to-noise ratio for us. You have fewer lines that are completely orthogonal to what you're trying to do,
01:43
and you get to focus on actually writing the algorithm that you need to fix the issue. The compile-time guarantees are interesting for a VCS. Cargo, the formatting and the testing framework are all very nice, and the safe-by-default aspect is very reassuring, but I think that's no news for any of you,
02:06
so I'm going to skip right into the performance aspect. There was an experiment by Valentin Gatien-Baron, who's a developer at Jane Street.
02:21
He built a very small subset of the status command in pure Rust. It's not complete. It does just a little bit of what status does, but it was good enough for their purposes. As you can see, the performance is just miles better than the Python and C implementation.
02:41
That sparked a lot of interest in the community, and although the plan to introduce Rust within Mercurial was already put in place maybe a year before that, it really was around this time that we really started to put it to use and to really dig into it.
03:03
There are many different ways of connecting Python and Rust together. I'm not going to go into detail as to why exactly we chose rust-cpython. It compiles on stable Rust. There you go. It's composed of two crates, the first one being a very low-level crate that just binds to the CPython ABI
03:22
and does all of the tedious work that you really don't want to be doing by hand, and a higher-level crate that is more functional in the sense of exposing a module to Python that just looks like a Python module from Rust, creating classes, functions, that kind of stuff. You have an eval function, which is pretty useful sometimes.
03:44
That gives the following structure. For people who are not using CPython, I'm sorry, but there you go. Pure Python code, of course, talks to its backend. The C extensions also talk to CPython. From the Rust side, we chose a... Well, I should say, just before I came to the project,
04:05
this structure was chosen of using two separate crates. This one, hg-core, is a self-contained Mercurial library with no idea whatsoever that there's a Python somewhere. The idea is to have it self-contained, and it should work on its own.
04:24
hg-cpython is one of the crates that we have, which is way more developed than the others because it's the more common one, but it's one of the possible crates that you could have to bridge the Python code and the pure Rust code.
04:40
I was very excited to start working on this because you get paid to write Rust, so that's super cool, and open-source Rust. But it was not super convincing at first. The first non-trivial program that I tried to write was about twice as slow as the reference implementation, even though it was written in Rust, and it was pretty sad.
05:03
This is due to a couple of factors, the main one being friction in general. If you're trying to bridge two languages together, you always have friction at the FFI layer. You have a border between the two languages, and you have to interface them.
05:21
They don't work the same way, especially Python and Rust have very different ways of handling memory and thinking about ownership and that kind of stuff. So you pay two prices. You pay the cognitive price, the developer price, I could say, of the complex interface code that you have to write that is basically not... It feels useless because you know that you're just trying to exchange data,
05:44
and that's it, but you have to still write a lot of code to do this. It's not the main thing that you're trying to do, but you still have to do it. It's an engineering constraint, and it's what you have to do. But it's complex, so it takes away some of your budget.
06:01
You have to think about all of those things. The fact of the matter is, exchanging data in general is costly. Just having to allocate memory at all, moving memory around, looping on objects, and just in general having the GIL, for example. You cannot do those things in parallel because Python has the GIL,
06:21
so if you're trying to create an object in Python, you can do some of the stuff outside of the GIL, but at the end of the day, if you're trying to communicate with Python, you still have to have the GIL. That's the global interpreter lock, for people who don't know. I have an example. On my laptop, if I try to stat 100,000 files in Rust in parallel with hot kernel caches,
06:44
I get about 30 milliseconds of wall time, which is pretty cool, and then I try to give those results to the Python layer in any meaningful way, and I get an order of magnitude more on top of what I was trying to do, completely negating the usefulness of doing it in Rust in the first place.
07:01
That's pretty frustrating. We have many possible solutions. One of them is to communicate with C directly, the C layer. There's a thing in the Python standard library called capsules that allows you to share an API of function pointers between C extensions,
07:23
and you can just target the C API with Rust and use capsules to communicate with the C layer of Python directly, which is pretty cool. You can exchange less data. In general, move up an abstraction layer and maybe give the file name instead of the file contents and that kind of stuff.
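To make the "exchange less data" idea concrete, here is a minimal Python-only sketch. The function names `stat_everything` and `changed_only` are hypothetical stand-ins for a native routine and are not Mercurial's actual API. The only point is the shape of the result: the second function does the comparison on its own side and hands back far fewer objects to the caller.

```python
import os
import tempfile


def stat_everything(root):
    """Low-level shape: return one (name, size, mtime_ns) tuple per file."""
    out = []
    for entry in os.scandir(root):
        st = entry.stat()
        out.append((entry.name, st.st_size, st.st_mtime_ns))
    return out


def changed_only(root, known):
    """Higher-level shape: compare here and return only the names that differ."""
    changed = []
    for entry in os.scandir(root):
        st = entry.stat()
        if known.get(entry.name) != (st.st_size, st.st_mtime_ns):
            changed.append(entry.name)
    return changed


def demo():
    with tempfile.TemporaryDirectory() as root:
        for i in range(100):
            with open(os.path.join(root, f"f{i}"), "w") as fh:
                fh.write("x")
        full = stat_everything(root)
        known = {name: (size, mtime) for name, size, mtime in full}
        # Rewrite a single file so that its size changes.
        with open(os.path.join(root, "f7"), "w") as fh:
            fh.write("changed")
        return len(full), changed_only(root, known)
```

In this sketch the first call returns 100 tuples while the second returns a single name, which is the kind of reduction the talk describes when moving up an abstraction layer.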
07:40
In general, just do more in Rust. To move up those abstraction layers, you need support for features, for those abstractions. In rust-cpython, we had a few missing features that were quite important. The first one of them was just a set. In Python, a set is a very useful collection.
08:02
That appears a lot in the Mercurial code base, and there was just no way of interacting with or creating a set from Rust at that time, so we had to do that. Support for capsules, so that's just like a py_capsule macro that would allow you to get and create a capsule.
08:20
And then a few more hairy ones, the first one being inheritance for classes written in Rust. You have a py_class macro in rust-cpython that allows you to create a Python class backed by Rust collections or data in general. That's pretty cool, but if your Python code then tries to inherit
08:43
from your Rust-backed class, it will just crash and say, no, it's not a valid base type. You cannot inherit from this. The reason is that the feature was not added, because what happens if you forget to call the initializer?
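The same kind of refusal can be seen in plain CPython, which may help picture the failure. This is only an analogy and not rust-cpython itself: the built-in `range` type cannot be subclassed either, so it stands in here for a Rust-backed class. The `Wrapper` class is a hypothetical example of the composition workaround, where every method has to be forwarded by hand.

```python
def try_inherit():
    """Attempt to subclass a type that does not allow it; return the error text."""
    try:
        class MyRange(range):  # range, like bool, refuses to be a base type
            pass
    except TypeError as exc:
        return str(exc)
    return None


class Wrapper:
    """Composition: hold the non-subclassable object and forward calls to it."""

    def __init__(self, start, stop):
        self._inner = range(start, stop)  # stand-in for a Rust-backed object

    def __len__(self):
        return len(self._inner)

    def __iter__(self):
        return iter(self._inner)

    def __contains__(self, item):
        return item in self._inner
```

Each forwarded method is an extra Python-level call on top of the underlying one, which is the indirection cost that comes with this workaround.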
09:00
What happens if you inherit and you just don't call super? What happens to the memory? It's a complicated problem, because either you have some very strong runtime invariants or you just don't do it. What was chosen was to just not do it. You have to use composition over inheritance, which sounds good in theory, but in practice,
09:20
it's just writing a lot of boilerplate code that adds an indirection, and method calls in Python are pretty expensive, so if you're trying to shave off milliseconds, that's not really helping. So that's one of the issues that we have. Properties and setattr, so giving instance attributes to instances of a class,
09:42
so they could have attributes and properties. It's a very common pattern in Python, so if you're trying to have a drop-in replacement where you just change a class, you can't, because there's no support yet for properties. And the last one, I want to talk a bit more about it, because I figured it's more interesting,
10:00
and it has to do with the terrible joke in the presentation title: iterators on Rust collections. What that means is you have any Rust collection, for example, a vector, to take a simple one, and you want to share a reference to that object
10:20
with Python and Rust at the same time to allow for lower memory overhead, better performance, and just in general not moving the entire object through the FFI layer at once. So what that means is that a Python iterator in Rust should behave the same way as a Python iterator would,
10:42
which means that it has different rules about what you can and cannot do at runtime. So if you're trying to mutate an iterator, so calling next on a Python iterator, if you already have a reference to it, it's valid,
11:02
as long as you don't try to read the reference after that. If you then try to read from the iterator, you will get a runtime error saying that something moved between the iterations. This is different from what Rust does. Rust would just not allow you to have those two references, one mutable, one immutable, at the same time
11:22
in the first place at compile time. So you have to use something called generational poisoning, which is basically just keeping a counter of which generation each reference belongs to. So every older reference gets invalidated if you're moving something.
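Both halves of this can be sketched in Python. The first function shows the built-in runtime check on a dict iterator. The rest is a minimal, hypothetical generation-counter container (`SharedList`, `_CheckedIter` and `StaleReference` are made-up names, and this is not the rust-cpython implementation) that mimics the idea: each mutation bumps a counter, and any iterator created under an older generation refuses to continue.

```python
def native_behaviour():
    """Mutate a dict while one of its iterators is alive; return the error text."""
    d = {"a": 1, "b": 2}
    it = iter(d)
    next(it)      # fine: nothing has changed yet
    d["c"] = 3    # mutation while the iterator still exists
    try:
        next(it)
    except RuntimeError as exc:
        return str(exc)
    return None


class StaleReference(RuntimeError):
    """Raised when an iterator outlives the generation it was created in."""


class SharedList:
    """Hypothetical container that bumps a generation counter on every mutation."""

    def __init__(self, items):
        self._items = list(items)
        self._generation = 0

    def append(self, item):
        self._items.append(item)
        self._generation += 1  # poisons every iterator handed out earlier

    def __iter__(self):
        return _CheckedIter(self)


class _CheckedIter:
    def __init__(self, owner):
        self._owner = owner
        self._born = owner._generation  # generation this iterator belongs to
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._born != self._owner._generation:
            raise StaleReference("container mutated since this iterator was created")
        if self._pos >= len(self._owner._items):
            raise StopIteration
        value = self._owner._items[self._pos]
        self._pos += 1
        return value
```

The check happens at use time instead of compile time, which matches the Python-side rules described above; a fresh iterator created after the mutation works normally.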
11:42
So that's one of the ways that you can do it. Of course, you have to tell the Rust compiler that it really has to let go of the memory, because you're trying to push memory and to share it with something that it has no control over, which is really not good in Rust terms, which is basically sharing the reference between the two languages.
12:01
So that was a proof of concept by my colleague Georges, over here, in June of last year. I upstreamed that about a month later for a small but non-trivial data structure in Rust. And since then, it has been upstreamed by another Mercurial contributor in rust-cpython.
12:21
So it's something that you can use if you try to bridge the two languages. You can use this system that we've put into place. And it relies on the lifetime extension trick. So it means that you take in a lifetime and you just say, this is static. It's fine. Everything is okay.
12:41
And then it's unsafe, of course. It is FFI. It is unsafe. But the invariant to have it not actually unsafe is pretty easy to uphold. The only thing that you have to think about is not to move the reference out of the borrow scope.
13:02
You don't have to do any manual dropping or anything very complicated. You just have to not move it out of the scope, which is reasonably easy to do in the context that we're working in because you have a py_class, so it's pretty hard to move things out of it. I just said it uses generational poisoning.
13:21
So we did some upstream work, as I said. PySet was upstreamed last year, and so was the capsule support. This iterator thing is more general than just iterators. It's technically any shareable data. I actually still have an issue with a Python class
13:42
that I want to expose multiple parts of as separate Python classes, and that's just super... There's a ton of boilerplate involved if you're trying to do it because you have to always go through the same indirection layer. So maybe that can help. I still have to try. Properties are being worked on,
14:00
so that's going to happen. So that's our target. That's what we're trying to achieve, but it is the unrealistic target because it does not do what it should do. It does not do all of what it should do. So where are we now? I've been working on it for a few months,
14:21
and there was a lot of work that was ungratifying, as you saw, but we're starting to make a little ground. So I have two cases, the first one being a pathological case of something that is in our favor, so it's a particularly bad repository with very terrible performance. As you can see, the standard status takes about six seconds,
14:43
which is super slow, and the new version with some Rust takes about one and a half, so it's better. It's definitely not perfect, but it's better. For a more realistic case, on any other repositories that I've seen,
15:02
you get more of a 50% increase in performance, which is definitely nice. 50% actually compounds a lot if you have a CI system that calls Mercurial a lot, but it's still so far from the 50 milliseconds that we had in the first place.
15:20
So what can we do and what can you do in general to make it go faster? It's not that complicated from a high-level perspective. Do more things in parallel, so we all know that building parallel code in Rust is a lot easier than in most other languages, especially in C.
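The talk's parallel code is written in Rust and is not shown here. As a rough Python analogue of the "same operation on every file, then collect the results" shape, the sketch below uses a thread pool; `stat_size` and `parallel_sizes` are hypothetical helper names. Threads can overlap in this case because CPython releases the GIL while it waits on the `stat` system call.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def stat_size(path):
    """One unit of work: stat a single file and return (path, size)."""
    return path, os.stat(path).st_size


def parallel_sizes(paths, workers=8):
    """Run stat_size over all paths in a thread pool and collect a dict."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(stat_size, paths))


def demo():
    with tempfile.TemporaryDirectory() as root:
        paths = []
        for i in range(50):
            p = os.path.join(root, f"f{i}")
            with open(p, "wb") as fh:
                fh.write(b"x" * i)  # file i is exactly i bytes long
            paths.append(p)
        sizes = parallel_sizes(paths)
        return [sizes[p] for p in paths]
```

Building the final dict still happens under the GIL, so this only illustrates the structure of the work and makes no claim about matching the Rust numbers quoted in the talk.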
15:42
And I think there's only, and I wrote the code so I have to remember, but there's only one loop that has been done in parallel, and it's not the most hungry one, as you could say. So there's still, the performance numbers that we just had was just optimizing one of the three main loops and not the biggest one,
16:01
so you still have a lot of performance to gain from that, from doing more things in parallel. Of course, better conditional execution, that just comes from the fact that rewriting a 15-year-old code base, you don't want to do it all at once. It's a very bad idea because you will get new bugs,
16:20
you will have two different complete implementations with basically just you to maintain it. It's just not a good idea. So you have to do it piece by piece at places that make sense, that are very performance critical. So to start, you just do some work that is just better in any way.
16:40
If it's good enough, you start upstreaming that, and then you work your way up. So better conditional execution would be to think about removing some paths in some cases and doing a little bit better. It's absolutely not specific to our case, just in general, try to move more incrementally. Rethinking the order of execution,
17:00
that has a little bit more to do with Rust because Rust allows us, with its very strong type system, to define better constraints on what we're able to do than what we could do with Python in the first place. So maybe we can do two loops at the same time that we didn't really know if we were capable of doing in Python
17:24
because it was just too complicated to do. Of course, there's fewer exchanges between Python and Rust. As I just said, anything going through the FFI layer is just additional time lost. And the usual suspects of fewer allocations, memory alignment, and bit fields and all that kind of stuff.
17:41
So right now, I'm expecting to have way better performance without doing that kind of... I mean, I'm not doing completely wild allocations or anything, but you don't have to go particularly deep into performance to gain a lot of it for something very important,
18:01
like IO, for example. But also, you could just not start Python. I know this can be controversial, maybe less in that room than yesterday in the Python room. But Python has a startup time that kind of adds up.
18:24
A third of the entire run of the test suite that we have, we have about 900 integration tests, a third of that is just starting Python and getting the imports going through. And we have optimized imports in Mercurial. We've tried to get it to go better.
18:41
And if you're aiming for 50 millisecond status time, you've just lost, because you have to start Python. So I am absolutely not saying that Python is not a good language or that you should not use it, et cetera. I'm just saying that some use cases are better suited for faster languages or compiled languages.
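The startup cost is easy to measure on any machine. The sketch below is a minimal measurement helper, not a benchmark from the talk: it launches the current interpreter with a script that does nothing and keeps the best wall time over a few runs. The function name is an assumption made for this example.

```python
import subprocess
import sys
import time


def interpreter_startup_seconds(runs=5):
    """Best-of-N wall time for starting Python and exiting immediately."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best
```

Comparing the returned value against a budget such as 50 milliseconds shows how much of it is gone before any application code runs. For a per-module breakdown, CPython 3.7 and later also accept the `-X importtime` option.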
19:03
For example, we could start embedding Python. So that means that Mercurial could become a Rust executable that embeds Python and that can fast-path some commands.
19:20
For example, the hg version command has no reason to take more than one millisecond, for example, because it is just like, give me the version. And it actually can slow down considerably some CI systems. Even if it just takes 60 milliseconds or something, it can add up, so it's tiny things that make up a lot of difference.
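The dispatch logic behind such a fast path can be sketched in a few lines. In the talk this would live in a Rust entry point; the Python below only illustrates the control flow, and every name in it (`FAST_PATHS`, `load_full_system`, `dispatch`, the version string) is hypothetical. The expensive startup is represented by a function that records when it was called.

```python
loaded = []  # records each time the expensive startup is triggered

FAST_PATHS = {
    # Commands that can be answered without the full system.
    "version": lambda args: "hypothetical-vcs 0.0",
}


def load_full_system():
    """Stand-in for starting the interpreter and loading all extensions."""
    loaded.append("full")
    return lambda argv: "handled by full system: " + " ".join(argv)


def dispatch(argv, fast_paths=FAST_PATHS, enabled=True):
    """Answer from a fast path when allowed, otherwise fall back to the full system."""
    command = argv[0] if argv else ""
    if enabled and command in fast_paths:
        return fast_paths[command](argv[1:])
    return load_full_system()(argv)
```

The `enabled` flag is there so the fast path can be switched off, for example when an extension needs to wrap a command and the full system has to run.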
19:41
And most CI systems don't care that much about the extensibility, like using extensions and making custom things. Usually they want the diff, they want a status, they want a log, and they want it very fast. And they want to clone and that kind of stuff. It's not operations that are super complicated or user-based. They're based on data structures and them being executed fast.
20:06
So there's PyOxidizer. Who's heard of PyOxidizer here? A few people. All right, cool. It's something that was done by a major Mercurial contributor. It's a Rust crate that allows you to embed Python applications.
20:25
It's basically a tool for distribution at first, but you can use it for a little bit more than that. And we have a plan to start distributing Mercurial with PyOxidizer for packaging reasons at first, but also now I have a Rust entry point, so I can just maybe not start Python in some cases.
20:43
So that would be a major performance gain, of course. Rust is a great language for writing VCS. I don't think, and that may be controversial, I don't think that Rust is a great language for every single application in much the same way that I don't think that Python
21:02
would be very nice for every single application. But VCS, they like data structure abstractions because a lot of it is just graphs and append-only databases and that kind of stuff that works really well with the trait system, basically.
21:24
And it's been very nice to work with Rust to have invariants that are compile time instead of just figuring out with asserts whether it's going to explode or not. VCS, they need to be both correct and fast, so you have different constraints than, say, in the video game industry in some aspects.
21:42
I'm not saying that they don't have to be correct at all, but it's not a huge deal if your game crashes sometimes, but if your VCS crashes and it crashes every second because there's like a million people using it. VCS, they like to do things in parallel because you have hundreds of thousands or millions of files
22:01
and most of it you just want to repeat the same operation for all of them and then collect the results, so Rust is very good for that. And also VCS, they have to work on bytes. They don't just like to work on bytes, they have to work on bytes because encoding is not always Unicode, I'm sorry. And I know that I would really love to just use Path
22:20
from the standard Rust library to represent a path, but we can't do that because it's a VCS and people use encodings other than Unicode. So I like Rust, and I like it for writing VCS, fast stuff. And, yeah, thank you.
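The bytes requirement is easy to demonstrate. The sketch below takes a file name encoded as Latin-1, which is a perfectly legal byte sequence for a file name on many systems but is not valid UTF-8. The helper names are assumptions made for this example. It shows that a strict decode fails, and that the bytes only survive a trip through text if an escape hatch such as Python's `surrogateescape` error handler is used.

```python
# "café.txt" encoded as Latin-1: the byte 0xE9 alone is not valid UTF-8.
LATIN1_NAME = "caf\u00e9.txt".encode("latin-1")


def is_valid_utf8(raw):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def roundtrip(raw):
    """Decode with surrogateescape, then encode back to the original bytes."""
    text = raw.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape")
```

A tool that stores such names as plain bytes never has to make this decision at all, which is why a version control system ends up working on bytes throughout. Python's own `os` functions follow the same approach: `os.listdir` returns bytes names when it is given a bytes path.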
22:41
Do we have any questions? Yes. Most of the... Yeah, sorry. The question was did we do any profiling on the startup time of Python
23:00
basically to figure out... In general, yes. We do a lot of profiling. One of the reasons why we started to put in C and Rust, et cetera, was because the profiler showed some hot spots that we had to help with. The startup time of Python is sometimes out of our hands. Most of what we can do about it is to simplify imports
23:22
and do less and do things lazily, but most of the work was already done a few years ago. So it's just a cost, because Python is like a fantastic machine to do very, very complicated stuff, so it has to have a cost. So I hope that answers your question. Yes. Have you heard of Mononoke, which is the...
23:43
Have I heard of Mononoke, which is the Facebook server implementation of Mercurial? Yes, and I actually know one of the developers of Mononoke. He's a very nice person who is way better at Rust than I am. And he... Yeah. It's very interesting, but it's very good for Facebook. I'm not saying that the idea is not good in general.
24:02
It's a good idea, but it works for Facebook, because they have the problems that they have, and they're just using Path, because all of their paths are UTF-8, and so they don't care. So it's very specific, and it's cool, and there are some ideas that we can actually talk to them about, to maybe get them inside Mercurial. But yes, I've definitely heard of it.
24:23
Do I have more time? Yes. Yes, that's one of the issues that we have.
24:41
So... Yes. Sorry, I really have to... How do we figure out how to handle extensions that can change commands if we're just not starting Python, which has... which holds the entire extension system? Good question, because we still have to figure that out.
25:01
There's a need right now to get hg to go faster in general, so people will give up some of the extension system for their CI, for example. So I think a fast path is acceptable as long as it's configurable. And after that, I really want to get extensions to go faster also, because repositories have been getting huge, and extensions are slow.
25:23
So that's another point. Should we rewrite the extensions in Rust? How do we interface? We're still not at that point right now. Thank you very much. Thanks.