
Realtime 3D Graphics on a MicroPython ESP32


Formal Metadata

Title
Realtime 3D Graphics on a MicroPython ESP32
Subtitle
Hacking the EMFCamp Conference Badge
Series title
Number of parts
542
Author
Contributors
License
CC Attribution 2.0 Belgium:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Year of publication
Language

Content Metadata

Subject area
Genre
Abstract
This is not really a "how-to" -- it's more of a "what-did". I spent an unreasonable amount of time writing a software 3D renderer for an extremely small and low-power ESP32 device running MicroPython. I will talk about the problems I encountered, the optimisations I made, and the eventual contributions I was able to make back to the MicroPython project. If you've been to other hacker conferences like EMFCamp or MCH, you may be familiar with the tradition of giving attendees interesting hardware on a lanyard instead of a traditional conference badge. Attendees are encouraged to experiment with and hack on the device both at the conference and afterwards. This year at EMFCamp 2022, the conference badge was a USB thumb drive sized ESP32 device running MicroPython. It has a joystick, an accelerometer, the cutest little LiPo battery, and a lovely 135x240 pixel colour TFT display. It seemed like it would be a fun single-weekend exercise to write a little 3D renderer for it in Python. It turned out to be a "fun" multi-weekend exercise, but I learned a lot that I'd like to share. I learned how to work around problems in the tooling. I learned there were problems in the display driver. I learned about optimising Python. I learned how to write native MicroPython modules to rewrite some hot functions in C. I learned about optimising how the Python talks to the C. And I even learned how to contribute some improvements back to the MicroPython project. I will bring the device with me; catch me after the talk to see it rendering Utah teapots up close.
Transcript: English (automatically generated)
Okay. My name is Matt Booth, and I'm here to talk to you about graphics programming in Python on an embedded microcontroller, which is hilarious because I'm none of these things: I'm not a graphics programmer, I'm not a Python programmer, and I'm not an embedded programmer. So we'll see how this goes. For that reason, I can't emphasise enough this part of the talk description: this is not an instructional talk. This is just what I did.

So, some background. EMF Camp is a weekend camping festival for hackers and makers, in a similar vein to the Chaos Communication Camp and the Dutch hacker festival. There are robots and lasers and geodesic domes and things. It's great fun, and if you get the opportunity to go, I highly recommend it.

It's a bit of a tradition at this style of event to give the attendees electronic event badges. The aim is to give attendees the opportunity to play with some hardware that they might not have come across before.
These are the two most recent badges from EMF Camp. For the one on the left, they put a SIM card and a GSM modem on it and set up an on-site cell phone network, so you'll understand why it's made to look like a Nokia N-Gage. It has all of the usual peripherals and sensors on there as well, like accelerometers, humidity, temperature and so on, and because it runs MicroPython it allows people to easily get started with experimenting with that kind of hardware.
The one on the right is the newest one. These photographs aren't to scale, by the way; let me just hold them up for comparison. The newest one is much smaller. The reason for that, as you might guess, is the silicon shortage that's been caused by fire, flood and plague. But it's still a lovely device.

On the one on the left you might recognise a version of The Settlers of Catan. I spent a lot of time trying to isolate small parts of the screen to redraw, because the update speed of that screen was so slow it was almost unusable for anything in real time. So when I got my hands on the new one this year, I obviously wanted to see what it could do.

The first thing I wanted to do was to just try to blit full-screen pixels to the device using the display driver directly. This took about 70 milliseconds, which is already orders of magnitude faster than the old badge. If I draw to an off-screen buffer instead, that's way faster.
But if you're doing that, you then have to get into the business of implementing your own drawing functions for primitives, and I didn't really want to do that. That is ominous foreshadowing, by the way. But I did discover that MicroPython has this cool framebuf module, which provides you with an off-screen frame buffer and also some drawing functions, which is great. So, 41 milliseconds. I thought that was a fair compromise. That's a good start: now I've got a baseline for how fast I can draw to the screen.
So obviously what this is about is drawing 3D things to the screen of this device. In case you don't know, this is basically 3D rasterisation 101: the minimum we have to do in order to get 3D points onto the screen.

We start with our vertex coordinates. Those are multiplied by the model matrix to get into world space, then by the view matrix to get into view space, and then by the projection matrix to get into clip space. Clip space allows you to see which vertices will be clipped by the edges of the screen.

Once we've got the list of vertices we want to render, we can do the perspective division to bring them into normalised device coordinate space, or NDC space. The perspective division is just the part that makes the further-away points closer together, so it gives you that illusion of 3D. Then we've got to convert the normalised device coordinates, which are between minus one and one, to screen space, which is our pixel coordinates.

When I was doing this to render these eight points of the cube on the screen, it wasn't too bad: 53 milliseconds. If you join those up to create your cube wireframe, it's not that much slower. There are 12 triangles there, obviously.
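The pipeline described above can be sketched in plain Python. This is a minimal illustration, not the code from the talk: it assumes OpenGL-style conventions (camera looking down the negative z axis, a row-major 4x4 matrix stored as a list of rows), and the function names `mat_vec`, `perspective`, `translation` and `project_vertex` are hypothetical. The 135x240 screen size comes from the talk abstract.

```python
import math

def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of 4 rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def translation(tx, ty, tz):
    """Matrix that moves points by (tx, ty, tz)."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style perspective projection matrix (an assumed convention)."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def project_vertex(vertex, model, view, projection, width, height):
    """Return (x, y) pixel coordinates, or None if the vertex is clipped."""
    x, y, z = vertex
    world = mat_vec(model, [x, y, z, 1.0])      # model space -> world space
    eye = mat_vec(view, world)                  # world space -> view space
    cx, cy, cz, cw = mat_vec(projection, eye)   # view space -> clip space
    # Clip test: a visible point satisfies -w <= x, y, z <= w.
    if cw <= 0 or not (-cw <= cx <= cw and -cw <= cy <= cw and -cw <= cz <= cw):
        return None
    # Perspective division: clip space -> normalised device coordinates.
    ndc_x = cx / cw
    ndc_y = cy / cw
    # NDC (-1..1) -> pixel coordinates, with y flipped so +y points up.
    sx = int((ndc_x + 1.0) * 0.5 * (width - 1))
    sy = int((1.0 - ndc_y) * 0.5 * (height - 1))
    return sx, sy

WIDTH, HEIGHT = 135, 240
model = translation(0.0, 0.0, 0.0)
view = translation(0.0, 0.0, -5.0)   # camera sits 5 units back from the origin
proj = perspective(60.0, WIDTH / HEIGHT, 0.1, 100.0)

centre = project_vertex((0.0, 0.0, 0.0), model, view, proj, WIDTH, HEIGHT)
near_pt = project_vertex((1.0, 0.0, 0.0), model, view, proj, WIDTH, HEIGHT)
far_pt = project_vertex((1.0, 0.0, -5.0), model, view, proj, WIDTH, HEIGHT)
behind = project_vertex((0.0, 0.0, 10.0), model, view, proj, WIDTH, HEIGHT)
```

In this sketch the point at the origin lands in the middle of the screen, the same x offset lands nearer the centre when the point is further away, and a point behind the camera is rejected by the clip test.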
The next step is to start filling in these triangles; you want to draw solid shapes, after all. Annoyingly, there's no function for doing that in the framebuf module for MicroPython. There is in the display driver, but as I mentioned, using the display driver directly is much slower because we're making many more calls to the hardware, setting pins high and low every time we want to draw something. You just want to do that once, when you blit the whole thing to the screen. So framebuf doesn't provide a polygon or polygon fill method, and I do have to get into the business of writing these sorts of functions myself after all.

The display driver itself does have these methods, so obviously that's the first place I looked for implementation clues. It has a polygon and a fill polygon method, only there are problems with it and it's a little bit rubbish. The figure on the left is just using the outline polygon method, and the second one is where I've tried to draw a filled polygon over the top of the wireframe polygon. You can see it just doesn't quite match up.

Reading the code, it seems to be implementing quite a well-known, well-documented polygon fill method. There's a link to the website where this algorithm is described, which also supplies a reference implementation, so I was able to copy the reference implementation to see if the display driver's implementation was different. It isn't; it's exactly the same. It looks like the display driver has inherited the same problems that were in the reference implementation. You'll notice that it's not only incorrect on this side: the left edge here is completely different to this edge here. It's overdrawing on this side and not drawing enough on that side.

A lot of the problems were rounding errors, floating-point to integer truncation and that sort of thing, which I managed to mostly fix, except for this really annoying pixel down here that I just couldn't get. I wanted to submit this enhancement to the framebuf module upstream to the MicroPython project, so we spent a few days scratching our heads over this to try to figure out what we could do. Initially we proposed just drawing the outline again on top of the filled polygon to sweep it under the rug. But eventually we managed to figure out a much better way of doing it: we try to detect when these stray pixels are going to happen and then fill them in explicitly instead of letting the algorithm do it.
I think it's pretty obvious that the algorithm was developed by a physicist or a mathematician, because the article that describes the algorithm says, and I'm quoting here, that detecting points on the polygon edge will deliver unpredictable results, but that this is "not generally a problem" because "the edge of the polygon is infinitely thin". Now, my polygons have an edge of one pixel, so this is obviously why we had to fix the problems with it.
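To make the edge problem concrete, here is a generic even-odd scanline fill in plain Python. It is a sketch of the general technique only, not the display driver's code, the reference implementation mentioned in the talk, or the patch that went into MicroPython. The names `fill_polygon` and `set_pixel` are hypothetical, and vertices are assumed to be integer pixel coordinates.

```python
import math

def fill_polygon(points, set_pixel):
    """Even-odd scanline fill of a polygon given as a list of (x, y) vertices."""
    n = len(points)
    ys = [y for _, y in points]
    for y in range(min(ys), max(ys) + 1):
        crossings = []
        for i in range(n):
            x0, y0 = points[i]
            x1, y1 = points[(i + 1) % n]
            # Half-open rule: an edge counts for rows from its upper end up to,
            # but not including, its lower end. A vertex shared by two edges is
            # then counted once, and horizontal edges are skipped entirely.
            if (y0 <= y < y1) or (y1 <= y < y0):
                x = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
                # Round to the nearest pixel rather than truncating.
                crossings.append(math.floor(x + 0.5))
        crossings.sort()
        # Fill between each pair of crossings (inside spans).
        for left, right in zip(crossings[0::2], crossings[1::2]):
            for px in range(left, right + 1):
                set_pixel(px, y)

pixels = set()
fill_polygon([(0, 0), (4, 0), (4, 4), (0, 4)], lambda x, y: pixels.add((x, y)))
```

With this half-open rule the bottom row of the square (y = 4) is never filled, although an outline drawn through the same vertices would cover it. That is one example of the kind of mismatch between a mathematically thin edge and a one-pixel-wide edge, and a real implementation needs explicit handling for such pixels.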
Anyway, now we can draw arbitrary polygons to the screen, so let's see what that looks like. This is the cube again, which is basically the hello world of 3D graphics programming, and it seemed to work pretty well: 66 milliseconds there. But in the left-hand screenshot it looks like you're looking at the inside of the cube. That's just because we are drawing the back face of the cube on top of the front face of the cube. So as part of this 3D rasterisation process you've now got to do back-face culling, which is more maths added on to that pipeline. You've got to calculate the normal vector of the face, which is the direction the face is facing, and then compute the dot product of that with the direction you're looking, so that you know whether the triangle is facing you or not. Then you just don't bother drawing the ones that aren't facing you.

But it's more maths, so it adds more time. And I get the occasional really long frame, which coincides with a garbage collection, I guess. We'll talk a bit more about that in a bit.
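The back-face test described above can be sketched as follows. This is an illustration under stated assumptions rather than the talk's code: triangles are assumed to be wound counter-clockwise when seen from the front, the test uses the vector from the face to the camera (an equivalent variant of comparing against the view direction), and all function names are hypothetical.

```python
def sub(a, b):
    """Component-wise a - b for 3-vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def face_normal(v0, v1, v2):
    """Unnormalised normal of a triangle with counter-clockwise winding."""
    return cross(sub(v1, v0), sub(v2, v0))

def is_front_facing(v0, v1, v2, camera_pos):
    """True if the triangle faces the camera and should be drawn."""
    to_camera = sub(camera_pos, v0)
    return dot(face_normal(v0, v1, v2), to_camera) > 0

triangle = ((0, 0, 0), (1, 0, 0), (0, 1, 0))   # normal points along +z
seen_from_front = is_front_facing(*triangle, camera_pos=(0, 0, 5))
seen_from_back = is_front_facing(*triangle, camera_pos=(0, 0, -5))
```

Only the sign of the dot product matters, so the normal does not need to be normalised for culling.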
There are some really low-hanging-fruit things we can do to improve the performance initially, which basically amount to being smarter about the algorithms we use. We precalculate the normals instead of calculating them every frame, which for a static model like this makes total sense. And we avoid doing the perspective division if we can help it. I'd implemented it as part of the matrix multiplication process, and usually it's a no-op: only when you're multiplying by the perspective matrix is it doing something. So we can avoid doing those divisions on every vertex, in every face, in every frame. That's quite a lot of time saved.

But it does mean I can add more things to it and make it do extra work. You add a rudimentary lighting model and make the cube nice-looking by adding shading and whatnot.
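The talk does not say which lighting model was used. One common rudimentary choice is flat Lambertian shading, where each face's brightness depends on the angle between its normal and the light direction. The sketch below assumes that model, a hypothetical ambient term of 0.2, and a 16-bit RGB565 colour format (a frequent choice for small TFT displays, but an assumption here); the function names are illustrative.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])
    return (v[0] / length, v[1] / length, v[2] / length)

def shade(normal, light_dir, base_rgb, ambient=0.2):
    """Flat-shade a face: scale base_rgb by ambient + diffuse light."""
    n = normalize(normal)
    l = normalize(light_dir)
    # Lambert's cosine law; faces pointing away from the light get no diffuse.
    diffuse = max(0.0, n[0] * l[0] + n[1] * l[1] + n[2] * l[2])
    intensity = min(1.0, ambient + (1.0 - ambient) * diffuse)
    return tuple(int(c * intensity) for c in base_rgb)

def rgb565(r, g, b):
    """Pack 8-bit-per-channel colour into 16-bit RGB565 (assumed format)."""
    return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)

lit = shade((0, 0, 1), (0, 0, 1), (255, 0, 0))     # face points at the light
dark = shade((0, 0, -1), (0, 0, 1), (255, 0, 0))   # face points away
```

Because the face normals of a static model can be precalculated, the per-face cost of this model is one dot product and a few multiplications.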
What I'm trying to do, basically, is to keep the rendering time below 100 milliseconds, because that seems like a good target to have. If I can do that, then I get a reasonable performance of 10 frames per second. Although this works well and is within that target, it's close to that target. I wanted to try something a bit more complex, so I downloaded a model of the industry-standard teapot and tried to render that. This is about 240 faces, 240 triangles, and it obviously completely destroyed my 100 millisecond time limit.

So I had to think of more ways to make this faster, and the obvious way is to rewrite all the hottest maths functions in C as a MicroPython native module. The two that are called most often are the vector-matrix multiplication method and the dot product method, and you can see that more than cuts the time in half. With the success of that, it's pretty clear I should rewrite all of the maths in C, because if I've got the bonnet up, I might as well. That brings the time right down to a glorious, glorious, glorious six frames per second.
I guess a general strategy is: if you find yourself calling a method 1,200 times a frame, it's probably a good target to be pushed down into the native layer.

A note on writing native code for MicroPython: there are really two ways of doing it. There are what are called external C modules, which is basically C code that you write as a module exposed to the Python runtime. Those are compiled directly into the firmware, which is a bit suboptimal, because it would be nice if I didn't require other people who have these devices to reflash the firmware every time I change this program. The other way of doing it is to write what they call a native module, which allows your application to supply native code as an .mpy file that can be dynamically loaded by your application at runtime. That's a much nicer way of doing it, so obviously that's what I wanted to do, but I did come across problems.
When I tried to build the native code, because I'd used a floating-point division in there for the perspective division step of the pipeline, I got this problem, which is a linker error from Espressif's toolchain for the ESP32. I'd love to know why this happens, and if anyone from Espressif is here, I'd love to know if it's fixed in a newer version as well. It seems like it can't link the software implementation of floating-point division. So what I did was download the source for their toolchain and find the assembly implementation of this function to add into my project, which also didn't work: the MicroPython build system wasn't prepared to accept that. But that was an easy fix, and it was actually the first change I got accepted into MicroPython. In my experience, they're very good at accepting patches.

Then, once I got that building, it just caused my application to crash. I'm not sure why this happens, but there seems to be a reference to the native code that gets collected erroneously by the garbage collector. I spent a lot of time trying to reduce my object allocations in the frames, but all that did was push the crash out further into the future. So I had to settle for compiling my maths functions directly into the firmware.
There are some other things I did to try to make it faster. The big one is trying to reduce object instantiations, which are super costly in Python. Wherever you can, preallocate lists and arrays and things, and then just reuse them.
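A small sketch of the preallocate-and-reuse idea, assuming vertices are stored as a flat array of floats (x, y, z, x, y, z, ...). The class and method names are hypothetical. The point is that the output buffer is created once and overwritten on each frame, so the per-frame loop allocates no new containers.

```python
from array import array

class VertexTransformer:
    """Holds an output buffer that is allocated once and reused every frame."""

    def __init__(self, vertex_count):
        # One allocation up front: 3 floats per vertex.
        self.out = array("f", [0.0] * (vertex_count * 3))

    def translate(self, vertices, dx, dy, dz):
        """Write translated vertices into the reused buffer and return it."""
        out = self.out
        for i in range(0, len(vertices), 3):
            out[i] = vertices[i] + dx
            out[i + 1] = vertices[i + 1] + dy
            out[i + 2] = vertices[i + 2] + dz
        return out

cube_corner_pair = array("f", [0.0, 0.0, 0.0, 1.0, 2.0, 3.0])
transformer = VertexTransformer(2)
frame1 = transformer.translate(cube_corner_pair, 1.0, 1.0, 1.0)
frame1_values = list(frame1)
frame2 = transformer.translate(cube_corner_pair, 2.0, 0.0, 0.0)
```

The trade-off is that the result of one frame is overwritten by the next, so callers must not hold on to the returned buffer across frames.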
I initially wanted a lot of my classes to be totally immutable, as the good programmer that I am, but that just wasn't feasible. You just have to mutate: when you do calculations on your vertices, mutate one of the operands and send it back that way.

The other thing I found that saves some time was reducing the number of times that we cross from Python into native code and back again. I found I was doing lots of the same operation on vertices and matrices, so if I could send them all as one batch in a single function call into the native side, that made it perform a lot quicker. I think there's a lot of function call and stack manipulation overhead there that you save. Also, pass arrays and not lists into the native functions, especially for this kind of stuff where we know ahead of time that the data we're passing are floats or whatever. Knowing what type is in your array means you can make some assumptions that MicroPython can't make, and when you manipulate the data objects on the native side you can skip a bunch of the type-safety stuff and just write directly to the data structure, which is useful.

It also surprised me, although maybe it's obvious to people who are veteran Pythonistas: I didn't expect the native libc qsort function to be so much faster than the sort function in Python, but it was. If you look at this picture here, you can see that some parts of the teapot that should be occluded are drawn on top of the body of the teapot. So what I had to do was Z-sort the faces so that we draw them from back to front, and that's what I was using the list sort method for. Implementing this face sorting as a native function as well was, as it says there, 100 times faster.

The other thing that made a measurable difference was locally caching object references in your functions. If you're using an object value more than once, instead of doing self.foo, self.foo, self.foo, just create a local variable in the function and use that instead. There are some dereferencing overheads there, which are quite significant, that we're saving.
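The last two points, drawing faces back to front and caching attribute lookups in local variables, can be sketched together in plain Python. This is the pure-Python form of the idea only; the talk moved the sort into C using qsort. The `Mesh` class and its fields are hypothetical, and the sketch assumes view-space coordinates with the camera looking down the negative z axis, so the most negative z is the furthest away.

```python
class Mesh:
    def __init__(self, vertices, faces):
        self.vertices = vertices   # list of (x, y, z) tuples in view space
        self.faces = faces         # list of (i0, i1, i2) vertex index triples

    def faces_back_to_front(self):
        """Return faces ordered furthest-first (painter's algorithm)."""
        # Look up the attributes once and keep them in locals, instead of
        # repeating self.vertices for every face.
        vertices = self.vertices
        faces = self.faces

        def depth(face):
            i0, i1, i2 = face
            # Sum of the three z values; proportional to the average depth.
            return vertices[i0][2] + vertices[i1][2] + vertices[i2][2]

        # Ascending z puts the most distant (most negative) faces first.
        return sorted(faces, key=depth)

mesh = Mesh(
    vertices=[
        (0, 0, -2), (1, 0, -2), (0, 1, -2),      # near triangle
        (0, 0, -10), (1, 0, -10), (0, 1, -10),   # far triangle
    ],
    faces=[(0, 1, 2), (3, 4, 5)],
)
draw_order = mesh.faces_back_to_front()
```

Sorting whole faces by average depth is an approximation: it handles a convex-ish model like a teapot body reasonably, but intersecting or very large overlapping faces can still be drawn in the wrong order.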
So after applying all of this sort of stuff, this is the final result, or the result so far. I'm pretty happy with it. Getting the teapot model down to under 100 milliseconds per frame was really pleasing, and I'm pretty happy with the performance.

So what can this be used for? Honestly, this was just a fun way to spend a few weekends after the festival had happened. But it seems to be performant enough that you could do some kind of small 3D game, like a lunar lander or something like that, or make yourself a Jurassic Park-style 3D user interface for your home automation.

But really the chief lesson for me, I think, was that the best way to get involved with a project like MicroPython is to just start using it. Eventually you come across some kind of limitation that you're probably best placed to overcome, because you're the one who's trying to solve the problem. You've got the vested interest in it, and all of the information is currently paged into your brain.

The MicroPython people were extremely helpful in helping me whip my contributions into shape, so thanks to them for helping me get involved in MicroPython, and thanks to you for listening. I can try to answer questions, but I'm not super expert on anything I've been talking about.
Hi, and thanks for your talk. I had a question about the ESP32 that you were implementing this on. Did you ever look at using the dual-core setup to try to accelerate any of the maths?

That is a good question, and someone has mentioned this to me before, but when I was writing this I was actually unaware that it had more than one core. So I haven't yet, but it's a great idea.

Thanks very much for your talk. If you're interested in MicroPython, in the building there is a stand about MicroPython and also a stand by Pine64, who make a smartwatch that can run MicroPython and stuff.