Realtime 3D Graphics on a MicroPython ESP32
Formal Metadata
Number of Parts | 542 |
License | CC Attribution 2.0 Belgium: You may use, modify, and reproduce, distribute and make publicly available the work or its content, in unchanged or modified form, for any legal purpose, provided that you credit the author/rights holder in the manner they specify. |
Identifiers | 10.5446/62029 (DOI) |
FOSDEM 2023 (79 / 542)
Transcript: English (automatically generated)
00:05
Microcontroller of hacker camps. Okay. Yep, my name is Matt Booth. I'm here to talk to you about graphics programming in
00:21
Python on an embedded microcontroller, which is hilarious because I'm none of these things: I'm not a graphics programmer, I'm not a Python programmer, and I'm not an embedded programmer. So we'll see how this goes. For that reason, I can't emphasize enough this part of the talk description:
00:42
This is not an instructional talk. This is just what I did. So, some background: EMF Camp is a weekend camping festival for hackers and makers, and it's in a similar vein to the Chaos Communication Camp and the Dutch hacker
01:06
festival. You know, there's robots and lasers and geodesic domes and things. It's great fun; if you get the opportunity to go, I highly recommend it. And it's a bit of a tradition at these kinds of events to give the attendees
01:23
electronic event badges, and the aim of these is to give attendees the opportunity to play with some hardware that they might not have come across before.
01:41
And these are the two most recent badges from EMF Camp. The one on the left here: if I told you that they put a SIM card and a GSM modem on it, and then set up an on-site cell phone network, you'll understand why it's made to look like a Nokia N-Gage. But it's got all of the usual
02:06
peripherals and sensors on there as well, like accelerometers and humidity and temperature sensors. And because it runs MicroPython, it allows people to easily get started with experimenting with that kind of hardware.
02:23
The one on the right there is the newest one. These photographs aren't to scale, by the way. Let me just hold them up for comparison. The newest one is much smaller; the reason for that, you might guess, is the silicon shortage that's been caused by fire, flood and plague,
02:44
as you might expect. But it's still a lovely device. The one on the left here: you might recognize this as a version of The Settlers of Catan. I spent a lot of time trying to isolate small parts of the screen to redraw, because the update speed of that screen was so slow that it was
03:07
almost unusable for anything in real time. So when I got my hands on the new one this year, I obviously wanted to see what this one could do. And so the first thing I wanted to do was to just try to blit full-screen pixels to the device using the
03:27
display driver directly, and this took about 70 milliseconds, which is already orders of magnitude faster than the old badge. If I draw to an off-screen buffer instead, that's way faster.
03:43
But, you know, if you're doing that, then you have to get into the business of implementing your own drawing functions for primitives, and I didn't really want to do that. That is ominous foreshadowing, by the way. But I did discover that MicroPython has this cool framebuf module, which provides you with an off-screen frame buffer and also
04:08
some drawing functions, which is great. So, 41 milliseconds: I thought that was a fair compromise. That's a good start. Now I've got a baseline for how fast I can draw to the screen.
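As a rough illustration of the off-screen approach described above, here is a minimal sketch using MicroPython's `framebuf` module. The 240x240 RGB565 screen size is an assumption (the talk does not state the panel's resolution), and the final blit call in the comment is hypothetical, since the display driver's API is not named in the talk. The import is guarded so the buffer-sizing part also runs on desktop Python.

```python
WIDTH, HEIGHT = 240, 240      # assumed screen size, not stated in the talk
BYTES_PER_PIXEL = 2           # RGB565 stores 16 bits per pixel

# One buffer, allocated once and reused for every frame.
buf = bytearray(WIDTH * HEIGHT * BYTES_PER_PIXEL)

try:
    import framebuf           # available on MicroPython only
except ImportError:
    framebuf = None           # desktop Python: skip the drawing calls

if framebuf is not None:
    fb = framebuf.FrameBuffer(buf, WIDTH, HEIGHT, framebuf.RGB565)
    fb.fill(0)                          # clear the off-screen buffer
    fb.line(10, 10, 200, 120, 0xFFFF)   # draw a line in memory
    fb.rect(20, 20, 50, 30, 0xFFFF)     # outline rectangle
    fb.pixel(120, 120, 0xFFFF)          # single pixel
    # Hypothetical final step: push the whole buffer to the panel in one
    # driver call, e.g. display.blit_buffer(buf, 0, 0, WIDTH, HEIGHT).
```

The design point is that all drawing touches only RAM, and the hardware is addressed once per frame.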
04:21
So, obviously, what this is about is drawing 3D things to the screen of this device. And so this is here in case you don't know: this is basically 3D rasterization 101. This is the minimum we have to do in order to get 3D points onto the screen.
04:43
You know, we start with our vertex coordinates, and then that's multiplied by the model matrix to get into world space, and then you multiply that by the view matrix to get into view space, and then by the projection matrix to get into clip space. And then the clip space allows you to see which vertices will be clipped by the edges of the screen or not.
05:04
So then, once we've got the list of vertices we want to render, we can do the perspective division to bring that into normalized device coordinate space, or NDC space. The perspective division is just the part that makes the further-away points closer together,
05:21
so it gives you that illusion of 3D. And then we've got to convert the normalized device coordinates, which are between minus one and one, to screen space, which is our pixel coordinates. And so when I was doing this,
05:42
to render these eight points on the screen from the cube, it wasn't too bad: 53 milliseconds. And then if you join those up to create your cube wireframe, it's not that much slower. There's 12 triangles there, obviously.
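The chain of transforms described above (model, view, projection, perspective division, then NDC to pixels) can be sketched in plain Python as follows. This is an illustrative sketch, not the speaker's code: the row-major nested-list matrices, the OpenGL-style projection matrix, and all function names are assumptions.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def perspective(fov_y, aspect, near, far):
    """OpenGL-style projection matrix (camera looks down -z)."""
    f = 1.0 / math.tan(fov_y / 2)
    return [[f / aspect, 0, 0, 0],
            [0, f, 0, 0],
            [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
            [0, 0, -1, 0]]

def to_screen(vertex, mvp, width, height):
    """Model space -> clip space -> NDC -> pixel coordinates."""
    x, y, z, w = mat_vec(mvp, [vertex[0], vertex[1], vertex[2], 1.0])
    if w <= 0:
        return None                              # behind the camera
    ndc_x, ndc_y = x / w, y / w                  # perspective division
    sx = int((ndc_x + 1) * 0.5 * (width - 1))    # [-1, 1] -> [0, width - 1]
    sy = int((1 - ndc_y) * 0.5 * (height - 1))   # screen y grows downward
    return sx, sy

model = translation(0, 0, 0)             # model sits at the world origin
view = translation(0, 0, -5)             # camera pulled 5 units back
proj = perspective(math.radians(60), 1.0, 0.1, 100.0)
mvp = mat_mul(proj, mat_mul(view, model))  # combine once, reuse per vertex
```

Calling `to_screen((1, 0, 0), mvp, 240, 240)` and `to_screen((1, 0, -5), mvp, 240, 240)` shows the second, more distant point landing closer to the screen centre. A fuller renderer would also reject vertices whose clip-space x, y or z fall outside the range -w to w.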
06:02
The next step is to then start filling in these triangles; you want to draw solid shapes, after all. Annoyingly, there's no method, or no function, for doing that in the framebuf module for MicroPython. There is in the display driver, but as I mentioned, using the display driver directly
06:28
is much slower, because we're making many more calls to the hardware and, you know, we're setting pins high and low and stuff every time we want to draw something. I mean, you just want to do that once, when we blit the whole thing to the screen.
06:42
Yeah, so framebuf doesn't provide a polygon or polygon fill method, and so I do have to get into the business of writing these sorts of functions myself after all. So yeah, the display driver itself does have these methods, so obviously that's the first place I looked for
07:04
implementation clues. They have a polygon and a fill polygon method, only obviously there are problems with it, and it's a little bit rubbish. Here's the...
07:20
The figure on the left there is just using the outline polygon method, and then the second one here is where I've tried to draw a filled polygon over the top of the wireframe polygon, and you can see it just doesn't quite match up. And so, reading the code,
07:41
it seems to be implementing quite a well-known, or well-documented, fill polygon method, and there's a link to the website where this algorithm is described, and it also supplies a reference implementation. So I was able to copy the reference implementation to see if
08:00
the display driver's implementation was different, and it isn't; it's exactly the same. It looks like the display driver has inherited the same problems that were in the reference implementation. You'll notice that it's not only incorrect on this side, but the left edge here is completely different to this edge here. So it's overdrawing on this side and not drawing enough on that side. A
08:24
lot of the problems with it were rounding errors and floating-point-to-integer truncation and that sort of thing, which I managed to mostly fix, except for this really annoying pixel down here
08:40
I just couldn't get. And I wanted to submit this enhancement to the framebuf module upstream to the MicroPython project, and so we spent a few days scratching our heads over this to try to figure out what we could do. Initially we proposed just drawing the outline again on top of that,
09:02
on top of the filled polygon, just to sweep it under the rug. But eventually we managed to figure out a much better way of doing it: we just try to detect when these stray pixels are going to happen and then fill them in explicitly instead of letting the algorithm do it.
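To make the class of problem concrete, here is a minimal sketch of a generic even-odd scanline polygon fill. It is not the display driver's code, the reference implementation mentioned in the talk, or the version that went into `framebuf`; the function name and the `set_span` callback are hypothetical. It rounds crossings to the nearest pixel rather than truncating, one of the fixes mentioned above.

```python
import math

def scanline_fill(points, set_span):
    """Fill a polygon given as [(x, y), ...] with integer vertices.

    set_span(y, x_start, x_end) is called once per horizontal run of
    pixels on row y, with both ends inclusive.
    """
    ys = [p[1] for p in points]
    n = len(points)
    for y in range(min(ys), max(ys) + 1):
        crossings = []
        for i in range(n):
            x0, y0 = points[i]
            x1, y1 = points[(i + 1) % n]
            # Half-open rule: an edge covers rows [low, high), so a vertex
            # shared by two edges is counted once. Horizontal edges never
            # match, which also avoids dividing by zero below.
            if (y0 <= y < y1) or (y1 <= y < y0):
                crossings.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
        crossings.sort()
        # Even-odd pairing: inside between crossings 0-1, 2-3, ...
        for i in range(0, len(crossings) - 1, 2):
            # Round to nearest instead of truncating toward zero.
            set_span(y,
                     math.floor(crossings[i] + 0.5),
                     math.floor(crossings[i + 1] + 0.5))

# Usage: record the spans produced for a 5x5 axis-aligned square.
rows = {}
scanline_fill([(0, 0), (4, 0), (4, 4), (0, 4)],
              lambda y, xa, xb: rows.__setitem__(y, (xa, xb)))
```

With this half-open rule the square's bottom row (y = 4) receives no span at all, while an outline drawn through the same vertices would cover it. That is the kind of fill-versus-outline mismatch described above, and it is why edge pixels need explicit handling in a pixel-exact implementation.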
09:34
Yeah, I do think it's pretty obvious that the algorithm was developed by a physicist or a
09:42
mathematician, because in the article that describes the algorithm it says, and I'm quoting here, that detecting points on the polygon edge will deliver unpredictable results, but that is, quote,
10:01
"not generally a problem" because, quote, "the edge of the polygon is infinitely thin". Now, my polygons have an edge of one pixel, so this is obviously why we had to fix the problems with it.
10:20
Anyway, now we can draw arbitrary polygons to the screen, so let's see what that looks like. This is the cube here again, which is basically, you know, the hello world of 3D graphics programming, and it seemed to work pretty well:
10:40
66 milliseconds there. But you can see on the left-hand screenshot there that it looks like you're looking at the inside of the cube, but it's just because we are drawing the back face of the back of the cube on top of the front face of the front of the cube. So as part of this 3D rasterization process, you've now got to do back-face culling, which is
11:05
more maths added on to that pipeline. You know, you've got to calculate the normal vector of the face, which is the direction the face is facing, and then compute the dot product of that with the direction you're looking, so that you can know if the
11:24
triangle is facing you or not, and then just don't bother drawing the ones that aren't facing you. But yeah, it's just more maths, so it adds more time. And oh yeah, I get the occasional really long frame, and that coincides with a garbage collection, I guess. We'll talk a bit more about that in a bit.
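The back-face test described above can be sketched as follows. This is a minimal illustration assuming triangles with counter-clockwise winding when seen from the front; the helper names are hypothetical. It uses the vector from the face to the camera, which is the same test as the dot product with the viewing direction, with the sign flipped.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def face_normal(v0, v1, v2):
    """Un-normalised normal of a triangle (counter-clockwise winding assumed)."""
    return cross(sub(v1, v0), sub(v2, v0))

def is_front_facing(v0, v1, v2, camera_pos):
    """True if the triangle's normal points towards the camera."""
    to_camera = sub(camera_pos, v0)
    return dot(face_normal(v0, v1, v2), to_camera) > 0
```

For example, the triangle `(0, 0, 0), (1, 0, 0), (0, 1, 0)` has normal `(0, 0, 1)`, so it is drawn for a camera at `(0, 0, 5)` and skipped for a camera at `(0, 0, -5)`. The normal does not need to be unit length here because only the sign of the dot product matters.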
11:49
Yeah, so there's some really low-hanging fruit we can pick to improve the performance initially, which basically amounts to being smarter about the algorithms we use. We pre-calculate the normals instead of calculating them every frame,
12:03
which for a static model like this makes total sense. And yeah, avoid doing the perspective division if we can help it, because I'd implemented it as part of the matrix multiplication process, and usually
12:23
it's a no-op unless you're multiplying by the perspective matrix; only then is it doing something. And so we can just avoid doing those divisions at all on, you know, every vertex in every face in every frame. That's quite a lot of time
12:42
saved. But it does mean I can add more things to it and make it do extra work, like, you know, adding a rudimentary lighting model and making the cube nice-looking by adding shading and whatnot.
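The talk does not say which lighting model was used. As one possible reading of "rudimentary", here is a sketch of flat Lambertian shading: each face's colour is scaled by the cosine of the angle between its (possibly pre-calculated) normal and the light direction. The ambient term, the function names and the RGB565 packing helper are all assumptions for illustration.

```python
import math

def normalize(v):
    length = math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])
    return (v[0] / length, v[1] / length, v[2] / length)

def shade(base_rgb, normal, light_dir, ambient=0.2):
    """Scale an (r, g, b) colour by a Lambert term.

    light_dir points from the surface towards the light. The ambient
    floor keeps faces turned away from the light from going fully black.
    """
    n = normalize(normal)
    l = normalize(light_dir)
    diffuse = max(0.0, n[0] * l[0] + n[1] * l[1] + n[2] * l[2])
    k = min(1.0, ambient + (1.0 - ambient) * diffuse)
    return tuple(int(c * k) for c in base_rgb)

def rgb565(r, g, b):
    """Pack 8-bit channels into a 16-bit RGB565 value."""
    return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)
```

With a static model, the `normal` passed in can come straight from the pre-calculated face normals, so shading adds one dot product and a few multiplications per visible face.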
13:00
What I'm trying to do, basically, is to keep the rendering time below 100 milliseconds as well, because that seems like a good target to have. If I can do that, then I get a reasonable performance of 10 frames per second. And so,
13:21
although this works well, and it's within that target, it's close to that target. So I wanted to try something a bit more complex, so I downloaded a model of the industry-standard teapot and tried to render that. This is about 240 faces, 240 triangles, and this obviously
13:46
completely destroyed my 100 millisecond time limit. And so I had to think of more ways to make this faster, and the obvious way is to rewrite all the hottest maths functions in C as a MicroPython native module.
14:05
The two that are called the most often are the vector-matrix multiply method and the dot product method, and yeah, you can see that more than cuts the time in half. And
14:21
with the success of that, it's pretty clear I should rewrite all of the maths in C, because, you know, if I've got the bonnet up I might as well. And that brings the time right down to a glorious, glorious, glorious six frames per second.
14:43
But yeah, I guess as a general strategy, if you find yourself calling a method, you know, 1,200 times a frame, it's probably a good target to be pushed down into the native layer. So yeah, a note on writing
15:02
native code for MicroPython: there's really two ways of doing it. There's what are called external C modules, which is basically C code that you write as a module exposed to the Python runtime. Those are compiled directly into the firmware,
15:25
which is a bit suboptimal, because it would be nice if I didn't require other people who have these devices to reflash the firmware every time I change this program. So the other way of doing it is to write what they call a native module,
15:44
which allows your application to supply native code as an .mpy file, and then that can be dynamically loaded by your application at runtime, which is a much nicer way of doing it. So obviously that's what I wanted to do, but I did come across problems.
16:02
When I tried to build the native code, because I'd used a floating-point division in there for the perspective division step of the pipeline, I got this problem, which is a linker error from Espressif's toolchain for the ESP32.
16:21
I'd love to know why this happens, and if anyone from Espressif is here, I'd love to know if it's fixed in the newer version as well. But it seems like it can't link this software implementation of floating-point division. So obviously what I did was download the source for their toolchain and find the assembly implementation of this method to add into my project,
16:45
which also didn't work: the MicroPython build system wasn't prepared to accept that. But that was an easy fix, and that was actually the first change I got accepted into MicroPython. In my experience, they're very good at accepting patches. And
17:01
then once I got that building, it just caused my application to crash. I'm not sure why this happens, but there seems to be a reference to the native stuff that gets collected erroneously by the garbage collector.
17:22
And I spent a lot of time trying to reduce my object allocations, you know, in the frames, but all that did was just push the crash further out into the future. And so, you know, I had to settle for compiling my maths functions directly into the firmware.
17:43
There's some other things I did to try to make it faster. The big one is trying to reduce object instantiations; it is super costly in Python. Wherever you can, pre-allocate lists and arrays and things and then just reuse them.
18:06
I initially wanted a lot of my classes to be totally immutable, as the good programmer that I am, but that just totally wasn't feasible. So, you know, you just have to mutate: when you do calculations on your vertices,
18:21
just mutate one of the operands and send it back that way. The other thing I found that saves some time was reducing the number of times that we cross from Python into native code and back again. I found I was doing lots of
18:40
the same operation to vertices and matrices, so if I could just send them all as one batch in a single function call into the native side, then that made it perform a lot quicker. I think there's a lot of function and stack manipulation overhead there that you save. And also pass arrays and not lists into the native functions as well, especially for this kind of stuff where we know
19:06
the data that we're passing are floats or whatever. You know ahead of time what type is in your array, which means you can make some assumptions that MicroPython can't make, and when you manipulate the data objects on the native side, you can skip a bunch of
19:23
type-safety stuff; you can just write directly to the data structure, which is useful. And also, it surprised me as well; well, I don't know if it's surprising, maybe it's obvious to people who are veteran Pythonistas, but I
19:43
didn't expect the native libc qsort function to be so much faster than the sort function in Python. If you look at
20:01
this picture here, you can see that some parts of the teapot that should be occluded are drawn on top of the body of the teapot. So what I had to do was Z-sort the faces so that we draw the faces from back to front, and that's what I was using the list sort method for here.
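The back-to-front ordering described above (often called the painter's algorithm) can be sketched in pure Python with the list sort method. This is an illustrative sketch only: the face representation (tuples of vertex indices), the use of the average view-space z as the sort key, and the function names are assumptions, and the native `qsort` version from the talk is not shown.

```python
def face_depth(face, view_z):
    """Average view-space z of the vertices referenced by a face."""
    return sum(view_z[i] for i in face) / len(face)

def sort_back_to_front(faces, view_z):
    """Sort faces in place so the farthest face is drawn first.

    Assumes the camera looks down -z in view space, so the most
    negative z is the farthest away.
    """
    faces.sort(key=lambda f: face_depth(f, view_z))
    return faces

# Usage: three triangles at different depths.
view_z = [-1.0, -1.0, -1.0, -5.0, -5.0, -5.0, -3.0, -3.0, -3.0]
faces = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
sort_back_to_front(faces, view_z)
```

Drawing the faces in the resulting order lets nearer faces overwrite farther ones. Sorting by average depth is a heuristic and can still mis-order large or intersecting faces.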
20:22
But just implementing this face sorting as a native function as well was, as it says there, 100 times faster. And the other thing that made a measurable difference was locally caching object references in your functions. So, if you're using
20:44
an object value more than once, instead of doing self.foo, self.foo, self.foo, just create a local reference, a local variable in the function, and use that instead. So there's some dereferencing overhead there that is quite significant that we're saving.
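Several of the Python-side optimizations above (pre-allocating typed arrays, mutating a reused buffer instead of creating new objects, and caching attribute lookups in locals) can be combined in one small sketch. The `Mesh` class and its method are hypothetical names for illustration, not the speaker's code.

```python
from array import array

class Mesh:
    def __init__(self, coords):
        # Flat typed arrays: x, y, z, x, y, z, ...
        self.src = array('f', coords)
        # Output buffer allocated once and reused every frame.
        self.out = array('f', [0.0] * len(coords))

    def translate_into_out(self, dx, dy, dz):
        # Cache attribute lookups in locals: every self.xxx inside the
        # loop would otherwise be a separate lookup.
        src = self.src
        out = self.out
        for i in range(0, len(src), 3):
            out[i] = src[i] + dx
            out[i + 1] = src[i + 1] + dy
            out[i + 2] = src[i + 2] + dz
        return out   # the same object on every call; nothing new allocated

mesh = Mesh([0.0, 0.0, 0.0, 1.0, 2.0, 3.0])
moved = mesh.translate_into_out(1.0, 1.0, 1.0)
```

Because `moved` is the mesh's own buffer, the caller must finish using it before the next call overwrites it; that is the trade-off for avoiding per-frame allocations and the garbage collection pauses they can trigger.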
21:06
So after applying all of this sort of stuff, this is the final result, or, you know, the results so far. I'm pretty happy with it. Getting the teapot model down to under 100 milliseconds per frame was
21:24
really pleasing, and yeah, I'm pretty happy with the performance. So what can this
21:41
be used for? Honestly, this was just a fun way to spend a few weekends after the festival had happened. But, you know, it seems to be performant enough that you could do some kind of small 3D
22:00
game, like a lunar lander or something like that, or, you know, make yourself a Jurassic Park-style 3D user interface for your home automation. But really, the chief lesson for me, I think, was that the best way to get involved with a project like MicroPython is to just start using it.
22:23
And eventually you come across some kind of limitation that you're probably best placed to overcome because, you know, you're the one who's trying to solve the problem. You've got the vested interest in it. You know, all of the information is currently paged into your brain. So
22:46
yeah. And then the MicroPython people were extremely helpful in helping me whip my contributions into shape. So yeah, thanks to them for
23:02
helping me get involved in MicroPython, and thanks to you for listening. I can try to answer questions, but I'm not super expert on anything I've been talking about.
23:21
Hi, and thanks for your talk. I had a question about the ESP32 that you were implementing on. Did you ever look at using the dual-core setup to try to accelerate any of the maths?
23:42
That is a good question, and someone has mentioned this to me before, but when I was writing this I was actually unaware that it had more than one core. So I haven't yet, but it's a great idea. Thanks very much for your talk.
24:01
If you're interested in MicroPython, in the building A there is a stand about MicroPython, and also a stand by Pine64, who make a smartwatch that can run MicroPython and stuff.