The Tamil Driver

Video in TIB AV-Portal: The Tamil Driver

Formal Metadata

Title: The Tamil Driver
Subtitle: Early hacking on the Mali T-series GPUs
Alternative Title: Graphics - Tamil
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

OK, so let's start. It's going to be a really short talk this time: six slides [unclear]. When I started doing open ARM GPU work, it was shortly after FOSDEM 2011 that we formed the idea [unclear]. As some of you might remember, we were here in 2012, one room over there, showing off Lima, which was a very intense experience: the room was so full, and I had spent three months hacking away like crazy. It was a very nice experience. The next year, in 2013, we could then turn up here and show some shader stuff, because we had split up the project, as you usually do. Connor was there; he was 16 then [partly unclear; his age when he started is given as 14 or 15], and we could not simply send him hardware, because first I would have had to ask his parents whether we could get that sort of information. In the second half of 2012 the Samsung Chromebook came out, which I will be showing off, and what I said was: we will never ever work on the newer version of the Mali before we have produced a proper driver for the older version of the Mali. So we first finish Lima and have a proper working driver there. I put my foot down hard, to Connor as well, all the time, that we would never ever do that. So here I am. The other code is still lying around; the code I wrote for last year is still lying around, and I haven't cleaned it up yet, so I need to push it. It wasn't working fully; some things were working and some weren't, so there's a lot to do still. In the slides I will try not to mention the Mali name, and I even tried rebranding. It's two separate drivers anyway, and the hardware doesn't share that much: it feels the same, but it isn't. That's why we got a new name. So yeah, a few months ago I said: no, this year I'm not
doing anything for FOSDEM. I got asked: "What are you doing? You usually do something." No, not this year; I'm taking it quiet, and I'm going to have a really calm Christmas break and a calm January. Somewhere around Christmas I got asked again: "What are you doing for FOSDEM?" Well, the schedule is a bit empty, and there are not that many people [unclear]; people were too lazy to file a talk, even though they had things prepared from other conferences [unclear]. So, as I usually do here at FOSDEM, I have written something new, and I've been really lazy about it. I only have six slides, and this is one of those six. Two of them only say "demo" or "questions", so I have three slides with actual content, and the first one is fully recycled from last year. Last year I only very briefly mentioned that I was working on the Tamil driver, so I just put that slide back in; that's really nothing. One of the reasons why I did this again is that it's always difficult to find people to talk about graphics. I don't understand why, because I know there is always a good audience, and lots of people could give interesting talks, which they usually are, but I always have to run behind people and try to get them to file talks. I had nothing to do in January or over Christmas, so why not do this? And it's been a while since I last teased my good friend at ARM [name unclear], so this is for you. Now, I have been very lazy and only have one demo for you guys, and nothing more. Just one demo, that's it [unclear]. Looking at one cube after the next is boring, so I didn't even bother with that, and
I have been so lazy that I've only written 10 thousand lines of code in the last six weeks. So let's go to the actual content. This is fully last year's slide, as I promised: no new content. The Tamil driver is for the Mali T-series [unclear], which first got released in 2012 already, with the Exynos 5 series; that was October 2012, which is all very, very long ago now that it's 2015. We started in 2012, at the beginning of the year, when I showed Lima, so we had only a few months before the hardware came out, and we haven't done much on it. But I will show that where I am is actually pretty far along [unclear]. The shaders are unified in the Mali hardware these days. Before, they were separate: a vertex shader and a fragment shader, each based on a really crazy architecture [unclear]. Connor had a bit of fun, really had a bit of fun; it wasn't as horrible as [unclear], and probably the first reverse engineering was still the hardest [unclear]. One thing that happened in the last year: after showing this slide I spent a while whining about the lack of a community for, for instance, Exynos. I am very active, as is Hans and probably a few other guys in this room, in the sunxi community, and that thing is very, very unique, or at least it was very unique a year ago. I complained that this sort of community doesn't exist for Exynos or for other SoCs [unclear]. The situation was
such that all the information on how to set up this Chromebook, for instance, was just a few blog entries left and right. The information was scattered everywhere, and you had to piece it together yourself: there was no nice wiki, there was no mailing list, there was no IRC channel. After my tirade here last year, somebody in the audience asked: "So what do we have to do?" I first completely misinterpreted this question and answered something else entirely, which is something that sometimes happens to speakers: they just answer the question that they think was asked. I should have listened more closely. [Name unclear], can you stand up? This guy, in the last year, brought up a server, a mailing list and an IRC channel, and we actually have quite a community in the meantime. It's not as big as sunxi, because that is something we've been at for three years, and the community that was started for Exynos has been growing slowly over the year, but it's getting somewhere, and it's quite amazing. So, last year, in September/October 2013, I spent about three weeks working on the new Mali T-series. I spent a week and a half looking at the command stream, seeing what was being sent back and forth between the kernel and user space, just as I had done before with Lima. I had to fix up some memory permissions as well, so that my tracing code could access the memory of the Mali [unclear]. All of that was built on the ARM Chromebook, even though there was no community. And so I had
capture and replay fully working already, which is a big first step when starting a reverse-engineering project. I also spent a bit of time exposing the binary compiler. Just like with the older Mali versions, for which we did Lima, the first thing on the project side is to split things up. These are usually two separate projects anyway: there is the shader compiler and there is the command stream. Slightly different skills are needed for each, there's a lot of work in both, and it's something that is nicely separated, so you can have two people working in their own little area who need very little communication. Of course they need some, and there is some information that you have to have available, but it's nicer than having one guy and another guy working on the same part of the project. So I spent some time looking at the binary, finding the compiler, and working out what the structure that needs to be passed into it looks like, and I exposed that. I then stopped working on it for more than a year, but it was still something to show, and I mentioned it in my talk last year. This year I am a lot smarter than four years ago. Back then I had never done 3D work. I had always done display driver work: display drivers, a bit of 2D acceleration, a bit of media. I had changed quite a lot in those areas, but I had never touched 3D, because there was enough to do everywhere else anyway. So I was going into a completely new area; I even had to learn about OpenGL. I had never done that before, and I had to read a book for a change. I was not really sure what to do initially, so I just took raw memory dumps. What you guys saw in 2012 was me looking at raw memory dumps, trying to make sense of them, and writing up some code that actually uses that and renders it itself. I wasted a lot of time just looking at those raw dumps, almost [unclear]
[unclear] at least two years, so it wasn't all that efficient. Then came Quake 3 Arena, and when I brought that up I thought: OK, this doesn't work anymore, it's just too complex, I'm not going to be able to handle all this information. So why don't you do the smart thing and write a parser? So I wrote that for Lima, and that's why the year after I showed up at FOSDEM with Quake 3 Arena. It is the same this round: after capture and replay, after exposing the binary compiler, OK, now sit down and write the parser. I could probably have used an existing tool [name unclear in the recording], but instead I kind of like another workflow, where I dump whatever I grab from the command stream into a C file; I can then compile another file together with that C file and just replay it. The parser is also just plain C, and what it does is dump all the structures. I use just one header file with all the command stream information, and it's all in one workflow: you just build one after the other, and if the demo still shows up in the replay and the GPU isn't complaining, then I have something good. So I built up a full parser myself, with nothing external, in plain C [unclear]. So I know quite a lot, or at least all the things that are needed for the Tamil driver for the Mali T-series: where textures live, what the structures look like; all those questions have been answered. Only tiny bits are left, like the typing for uniforms, attributes and varyings, which are the three kinds of variables that you use between shaders. I don't have that figured out yet, but it's a tiny bit of busy work that just takes time. There's nothing big and scary waiting anymore. And again, everything is completely different. It all feels the same, but it's not the same. The only code I could reuse from Lima is a bit of texturing
code, and I still haven't found the bit [unclear] that just says "use this as a plain texture". It's usually swizzled [unclear], which is a bit easier on the caching. So that's the only bit of code I could really reuse: it's about 50 lines of C. Everything else is pretty much new, although it all feels the same. It went quickly because I had the experience of the first Mali series and its code. So I wrote the parser, and I was kind of done with it two weeks ago. Then I took the more difficult render and said: OK, I have this big C file with a bunch of structures in there [unclear], with state that is wrong and bits that are still raw, and I just take it apart one by one and build up, build up to the demo: use proper textures, use proper vertex state, and even use the binary compiler. Because it's the same game again: the binary compiler is what I use, and if somebody else is interested, they can go and finish Connor's work. Connor is now at college, or at Intel, and therefore he's a bit busy. His new IR infrastructure for Mesa
should fit quite well for this Mali version and for the two shader engines before it. And yes, I have one demo, which is working quite nicely in comparison to what I showed off two years ago. You guys might remember the cube that was spinning for a bit, and then the caching would go off completely and it would become pretty ugly. This time it actually works quite well. About this binary compiler: ARM has a format called MBS, so I thought, OK, I can reuse the MBS parser that I built up like two years ago. It's just a file that describes what the uniforms look like, what your varyings and attributes are [unclear], and then there's the blob with the actual binary. It turns out this version of MBS looks the same and has the same kind of layout [unclear], but the content is pretty much all different, so I had to rewrite that one as well. So it's supposed to be the same, but it isn't. Others would probably have chosen to continue in the original project, but this is completely separate hardware, so of course it doesn't make sense to have everything in one code tree. Two separate drivers that are small and manageable: that's much nicer [unclear]. So, the hardware is your bog-standard ARM Chromebook that everybody has; it was the top seller on Amazon for a long while, I think, in 2012-2013. It has a very, very bad kernel running, and this is an installation from summer 2013; September/October 2013 is when I brought up capture and replay and found the compiler in the binary. It has the most horrible graphics driver that I have seen since I started doing graphics drivers for Linux, and I started in 2003. When I started, 11 years ago, mode setting wasn't a thing; the framework for it didn't exist at that time. And even when I was still [unclear] at that time,
[unclear] register tables, and this driver still uses register tables today. If you're doing a demo, you usually want an external connector to work. But if you plug in HDMI, especially with this version, if you plug in HDMI under X and you try to use it, you get a kernel panic. And if you try to use it as a KMS driver, it's the same thing: if you try to use an overlay, you can allocate a buffer just fine, and you can then try to assign it to an overlay and enable it, and then you get a nice kernel panic. So my demo is going to be rendering to a bit of memory and then doing a memcpy to the framebuffer. That's really great for performance [unclear]. It's a dual core [unclear], and it's all in the same process [unclear]. It's really terrible, and I was wondering for months and months why people were always complaining [unclear] about how bad Exynos is. [Exchange with the audience, partly unclear.] The thing is, the code has not improved enough. "It has been fixed?" No, it hasn't. It might not kernel panic anymore, but those register tables are still in there [unclear]. OK, so you're one of the [unclear]. Some guys, I think, are working quite hard on cleaning up the driver, but I've seen no movement on getting these register tables out [unclear]. OK, so it's a process. So you have written a proper bit of code to deal properly with HDMI? Is it still a table? OK, sorry, I was just asking. So then, let's go to the meat and bones of this talk and show you [unclear]. Yeah, this slide, so that we know that we have five slides: this is the demo with the memcpy.
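The workaround described here (render into ordinary memory, then copy the result into the framebuffer) can be sketched as follows. This is only an illustration under stated assumptions, not the speaker's code: the function name `copy_to_framebuffer`, the buffer sizes and the padded stride are hypothetical, and plain `bytearray` objects stand in for the GPU render target and the mapped framebuffer.

```python
def copy_to_framebuffer(render, render_stride, fb, fb_stride, row_bytes, rows):
    """Copy `rows` rows of `row_bytes` bytes from the render target into fb.

    Strides are in bytes and may differ: a framebuffer row is often padded,
    so the copy goes row by row instead of as one large block.
    """
    src = memoryview(render)
    dst = memoryview(fb)
    for y in range(rows):
        s = y * render_stride
        d = y * fb_stride
        dst[d:d + row_bytes] = src[s:s + row_bytes]


# Hypothetical sizes: a 4x2 image with 4 bytes per pixel, tightly packed,
# copied into a framebuffer whose rows are padded to 32 bytes.
WIDTH, HEIGHT, BYTES_PER_PIXEL = 4, 2, 4
render = bytearray(range(WIDTH * HEIGHT * BYTES_PER_PIXEL))
fb_stride = 32
fb = bytearray(fb_stride * HEIGHT)

copy_to_framebuffer(render, WIDTH * BYTES_PER_PIXEL, fb, fb_stride,
                    WIDTH * BYTES_PER_PIXEL, HEIGHT)
```

On real hardware the destination would be a memory mapping of the display's framebuffer rather than a `bytearray`. The extra copy per frame costs CPU time and memory bandwidth, which is why the speaker jokes about how great it is for performance.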
[unclear] I mean, there's nothing connected to this apart from mains power, so there is no networking between the two of them. I'm just going to type in the demo, and whatever frames per second it is outputting is going to scroll by in the background. The memcpy is just copying over into a mapped framebuffer [unclear]. There is a newer version of the kernel, and I hope that it's better, but you don't do that sort of thing if you're doing a demo: you don't risk giving up a slightly working version for maybe something that works, but maybe not, and then you have nothing to show. I'd rather have something broken and bad than something that doesn't work at all. You've probably seen this
before [unclear]: a cube sitting there, with some movement. What you see here is just the FPS scrolling past in the background, and it's doing about 45 at this point, which is not too bad for no optimization whatsoever. There's something going wrong too, because you sometimes see tiles go white, but I don't know what that is, and there's no point trying to figure it out until I have a proper display driver [unclear]. So what are we seeing here? We're seeing three different shader programs: one for the background, one for the flat shaded one, and one textured one [unclear], and four different draws. The background is one draw with a low vertex count only; the other is four thousand two hundred something [unclear]. So job control is pretty much working. With the old Mali you would submit a job for the vertex side, and that would have a command stream for the vertex shader jobs, which it would do in turn, in order: one for the vertex shader and one for the tiler. There would be distinct points in them [unclear]. Then you would run the fragment shader job at the end, manually, from user space. Now you set up a structure with three different jobs described in sequence. One is the vertex shader job [name unclear], and then one is the actual tiler, which is where most of the work is: the tiler takes all the geometry, the actual draw information, and sticks it into the tiler buffer so that it can be read back efficiently later on from the fragment shader. Then there is the fragment rendering at the end, per tile, which is 16 by 16, with tiles sometimes combined and sometimes not [unclear]. I can resize this as much as I want. ARM has something
special with this version, which is called transaction elimination. Every tile is 16 by 16 pixels and gets a CRC-64 calculated in hardware, very likely as part of the tile write-out. If the CRC-64 matches the previous one for that tile, then the memory is not written out to the final framebuffer, which saves memory bandwidth. At one point I saw a bit of memory that was exactly the number of tiles [unclear] times 8, so I assume that it's that, because otherwise I don't have that sort of information. On the previous Mali series it was very interesting that there were addresses for each individual tile, specially ordered [unclear] so that it would be cache efficient, so that it would have locality: the tiles would sit in memory next to each other just as they would sit on the screen. I hadn't heard of that, since I don't have a proper 3D background, but somebody [unclear] told me about it [unclear]. This time you don't even have that: you just give it the right amount of memory and it does all of that for you [unclear]. In this demo all the draws are replayed almost directly. I haven't programmed up how to get the uniforms into the uniform buffer, so all these rotation matrices and positions are just hard-coded right now; it's just a bit of busy work. Typing, as I said before, is also not there: what varyings look like, what attributes look like, how they should be ordered in memory. And the binary compiler mostly works, because I had to change half of the original binaries for the shaders that came out of the dumps. They always had the positions of the attributes (the attributes are, for instance, the vertex position and the texture coordinates) rewritten afterwards. It's something that ARM for some
reason likes to do. The new MBS format, the Mali binary format, actually has a part that says: rewrite this at this location; here is the byte that holds the position of this attribute, the vertex data or the texture coordinate. Why you would want to rewrite that, I don't know, because where the attributes are listed there is also a nice table, and if you alter the ordering in that table you get the exact same result. Maybe they think it's fun, or faster. That's pretty much all I have to show. One more thing: this is about 13 thousand lines of code. This demo is only 2.6 thousand; the parser and everything else make up the rest. So let me show you my final slide.
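The transaction elimination scheme described a little earlier (one signature per 16 by 16 tile, and no framebuffer write when the signature is unchanged) can be sketched in a few lines. This is a software illustration under stated assumptions, not ARM's implementation: `zlib.crc32` is a 32-bit stand-in for the hardware CRC-64, pixels are one byte each, the frame size is assumed to be a multiple of the tile size, and the names `tile_bytes` and `write_frame` are hypothetical.

```python
import zlib

TILE = 16  # tile edge in pixels, as stated in the talk


def tile_bytes(frame, width, tx, ty):
    """Gather the pixels of tile (tx, ty); one byte per pixel for brevity."""
    rows = []
    for y in range(ty * TILE, (ty + 1) * TILE):
        start = y * width + tx * TILE
        rows.append(bytes(frame[start:start + TILE]))
    return b"".join(rows)


def write_frame(frame, width, height, framebuffer, signatures):
    """Write only tiles whose signature changed; return how many were written."""
    written = 0
    tiles_x = width // TILE
    for ty in range(height // TILE):
        for tx in range(tiles_x):
            data = tile_bytes(frame, width, tx, ty)
            sig = zlib.crc32(data)  # stand-in for the hardware CRC-64
            index = ty * tiles_x + tx
            if signatures[index] == sig:
                continue  # unchanged tile: skip the framebuffer write
            signatures[index] = sig
            for row in range(TILE):
                dst = (ty * TILE + row) * width + tx * TILE
                framebuffer[dst:dst + TILE] = data[row * TILE:(row + 1) * TILE]
            written += 1
    return written


# A 32x32 frame has four tiles; signatures start out empty.
width = height = 32
frame = bytearray(width * height)
framebuffer = bytearray(width * height)
signatures = [None] * 4

print(write_frame(frame, width, height, framebuffer, signatures))  # 4
frame[0] = 255  # touch one pixel in the top-left tile
print(write_frame(frame, width, height, framebuffer, signatures))  # 1
print(write_frame(frame, width, height, framebuffer, signatures))  # 0
```

With 64-bit signatures, the signature store would need 8 bytes per tile, which is consistent with the buffer of "number of tiles times 8" that the speaker reports seeing.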
This is just again to show that I actually have one [unclear]. So,
questions? Oh, one more thing that I
[unclear] so
it's doing what, 26 FPS? OK, it's doing 26 FPS. It wasn't always doing that; it must be the weather here in Belgium, because it was always doing 45 [unclear]. It's supposed to run this for loop 100 thousand times, but I know that when I let it run for a bit more than half an hour or so, it crashes, and it always crashes at frame 79,530. I had it running on the train all the way over; I just sat there for 20 minutes, waiting to finally see this one frame come through [unclear]. At least it showed my demo the first time. So, 79 thousand something; I don't know why, because the command stream looks the same. Something in my job submission is going wrong, and I'll find out. So yeah, questions? [Audience question, partly unclear, about hardware bugs.] No, I know that there were [unclear]. No, I would blame the kernel. [Audience: like with the HDMI driver?] [unclear] [Audience question about the name, unclear.] Because it's the Mali T-series, and the last one was Lima. There were two of us [unclear], and we went and talked to ARM. In the middle of 2011 we talked to ARM and they weren't interested, and shortly before I came here in 2012 we talked to ARM again, and the only thing that they could come up with concerned the name. The original name was remali [unclear], because I had a library that I had developed, which was a bit of a framework so that I could write tests almost directly for the hardware. That is something I'm not doing this time: I just have the parser, and then I'm going to Mesa almost directly, well, whenever I get to work on this again. And so the remali driver, if
it was prepended with "lib", was a really nice pun. Really nice. But what ARM said was: we have a trademark, and no, you can't do that. So we brainstormed a bit, and we just called it Lima. So this driver is completely the same and completely different, and it's for the Mali T, so I just call it Tamil. It's the only word with those letters that people actually know [unclear]; it's a language, spoken in areas all over the world. It's nothing personal, it's just called that. [Audience question, unclear.] No, it hasn't. [unclear remark about internationalization] Yeah, I'm not that crazy. One of my good friends in Nuremberg is the [unclear] creator. Anything else? [Closing announcement, partly unclear.] Before we go: there are 23 livestreams again this year. I know that they're not working for everybody, but it's a lot of volume that's going out, and unlike last year all the videos will be available almost immediately. The FOSDEM organization has done a wonderful job, so a hand for them. Thanks a lot. [Closing remarks unclear.]