Healthy code for healthy teams (or the other way around)
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 131 | |
Author | ||
Contributors | ||
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/69452 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
00:00
CodeGoogolAnalog-to-digital converterString (computer science)Keyboard shortcutModule (mathematics)ScalabilityType theoryShape (magazine)Population densityGoodness of fitProjective planeCoefficient of determinationMultiplication signCountingBasis <Mathematik>Formal languageTelecommunicationRow (database)Keyboard shortcutBuildingOpen sourceSign (mathematics)PrototypeJava appletCodeNumberSoftware frameworkPower (physics)Category of beingMilitary baseRight angleWave packetEndliche ModelltheorieBitWorkstation <Musikinstrument>Hydraulic jumpPosition operatorRule of inferenceComputer fileAxiom of choiceMathematicsDifferent (Kate Ryan album)Performance appraisalSoftwareScaling (geometry)Fitness functionPerfect groupBimodal distributionCAN busSoftware testingProcess (computing)Software maintenancePixelComplete metric spaceSocial classInheritance (object-oriented programming)Computer animationLecture/Conference
09:39
Module (mathematics)Type theoryScalabilityShape (magazine)Population densityGoogolBefehlsprozessorNumberMultiplication signType theoryEndliche ModelltheorieDifferent (Kate Ryan album)RankingPower (physics)Projective planeBitSocial classComputer fileParameter (computer programming)TelecommunicationFluxGoodness of fitView (database)Shape (magazine)Java appletPoint (geometry)CodeIntegrated development environmentVulnerability (computing)Message passingBimodal distributionModule (mathematics)Lipschitz-StetigkeitService (economics)Group actionFitness functionOperator (mathematics)Position operatorControl flowExclusive orSign (mathematics)IntegerLimit (category theory)Wave packetFocus (optics)Computer animation
18:48
MathematicsFrustrationMultiplication signType theoryRight angleComa BerenicesLibrary (computing)Logistic distributionInformationIterationCodeComputer animation
21:33
Object-oriented programmingPhase transitionBlock (periodic table)Broadcast programmingData structureGoogolScripting languageTime zoneNetiquetteMultiplication signLattice (order)Coefficient of determinationSign (mathematics)Block (periodic table)Type theoryMathematicsScheduling (computing)InformationCuboidDecision theoryFigurate numberNetiquetteSelf-organizationChannel capacitySoftware bugOpen sourceRun time (program lifecycle phase)Spring (hydrology)Game theoryFocus (optics)Uniformer RaumHacker (term)Time zoneData conversionShape (magazine)Set (mathematics)Range (statistics)Scripting languageComputer animation
26:22
DataflowTensorShape (magazine)Type theoryRun time (program lifecycle phase)RankingDimensional analysisCASE <Informatik>IntegerPhase transitionMedical imagingMatrix (mathematics)1 (number)MultiplicationOpen sourceWeightArray data structureMultiplication signShape (magazine)Wave packetBitDataflowAlpha (investment)AuthorizationLibrary (computing)Cartesian coordinate systemString (computer science)TensorStapeldateiSymbol tableContext awarenessReduction of orderFunctional (mathematics)NeuroinformatikClique-widthStandard deviationTouchscreenNetwork topologyNumberToken ringError messageMathematicsExpressionOverhead (computing)Order (biology)Position operatorComputer animation
32:40
GoogolComputer-generated imageryGlass floatPhase transitionEndliche ModelltheorieType theoryLogic gateEvent horizonDecision theoryTelecommunicationPersonal digital assistantMultiplication signStapeldateiInformationSet (mathematics)Point (geometry)CodeSoftware bugSelf-organizationDataflowSocial classStructural loadSlide ruleDemonEndliche ModelltheorieType theoryError messageFunctional (mathematics)Decision theoryLine (geometry)Open setProcess (computing)Spring (hydrology)Open sourceGoodness of fitUtility softwareCASE <Informatik>System callMessage passingProjective planeAliasingMathematical analysisRight angleSound effectInterior (topology)TelecommunicationComputer animation
38:57
GoogolEuler anglesMultiplication signNP-hardSoftware repositoryProduct (business)CodeBasis <Mathematik>Core dumpAreaSoftware testingUniversal product codeMathematicsStability theoryProof theoryPhase transitionCASE <Informatik>Military baseComputer animationLecture/ConferenceMeeting/Interview
41:22
Dynamic random-access memoryOrdinary differential equationInheritance (object-oriented programming)Block (periodic table)Similarity (geometry)Range (statistics)Fuzzy logicShape (magazine)Machine codeMultiplication signType theoryLine (geometry)System callArray data structureLecture/Conference
42:47
CodeMultiplication signExclusive orWave packetRule of inferenceWhiteboardShape (magazine)Projective planeProcess (computing)Goodness of fitMatching (graph theory)Military baseLecture/Conference
45:29
Design of experimentsLecture/ConferenceComputer animation
Transcript: English(auto-generated)
00:04
Okay. Here we go. Good morning, EuroPython. How are you guys? I'm very grateful that you woke up this morning and came back here, because I know that it is a long conference. There was a social yesterday, where I hope you had a blast. But yeah, I'm very grateful
00:28
that you made it and you're here. And you know people say you must start with the things that you love? I think it's not true. I really love to hate. I'm a hater. I'm the biggest
00:41
hater. And I'm going to prove to you that I'm not some tiny weirdo who hates things. I think you all are haters. Because hear me out. Who doesn't love to hate their codebases? Who thinks their codebase is perfect? There's no technical debt. They
01:02
think it's perfectly designed. And everything is peaches and rainbows. Show me hands. Like, am I wrong? Who has never really... oh, no. People who work on projects with a good maintainer process don't count. Okay. Apart from those folks, for the rest of us, particularly
01:27
for those of us working in research, where the codebase is a little bit wobbly, we're going to talk here about how to create a codebase that is sustainable, where the changes are maintained, but most importantly, one that doesn't give you an aneurysm every time you open it.
01:45
And who am I? I'm Mai Giménez. I use she/they pronouns, and I'm a senior research engineer at Google DeepMind, which is sometimes a little bit of a weird topic, but my position, being a research engineer, means I'm half-time a researcher, half-time an
02:05
engineer. I try to build good codebases for the researchers, and I'm a researcher myself, so sometimes if there's a shortcut that I need to take, I understand. But I overall try to make things nice and pretty. I've been working in large language. I was working
02:25
in language models, which means in this day and age, I work in large language models. I'm one of the trainers and evaluators of Gemini, and currently I'm focusing on multimodal. And this is going to be important. I used to be a text person. I'm now a multimodal
02:43
person. And in my brag file, it says that I'm one of the original designers of the Gemini codebase. So that means that some of the design choices that we made are fueling the biggest large language model. We don't know if it's the biggest because we are
03:05
not releasing things and no one is releasing, so it is at least fueling the most capable language models. And you might think, like, oh, you must have perfected a framework. The infrastructure, the design docs were great. Well, actually not. A good codebase starts
03:27
with people who are open-minded enough to understand that the things they did in the past might not be useful for the future, and who are willing to try new things. They are kind, humble enough to hear everyone, and don't cling to a solution. And honestly,
03:46
that's the way we built this very strong codebase. If we want people to be motivated and engaged with our changes, we need to have a mission. We need to have a North
04:04
Star that we follow all the way. So even if it sounds cheesy, even if it sounds weird, you must have a mission. If you come from academia, as I do, I used to be a one-person orchestra, being the sole contributor to my GitHub repo. I was writing
04:27
my research grants, writing the papers, writing the experiments. So I didn't have the training when I was studying to actually understand that this is a team effort. But we cannot do
04:40
software, but more importantly, we cannot do research alone. This is a team sport, so we need to rely on each other. So your teammates are the people that you're going to rely on the most, and they are going to make things worth it. And I say I'm the biggest hater, but honestly, this is a love letter. This whole talk is a love letter to my teammates
05:05
and the thing that we built. It is not perfect, but it's been a blast to build it with them. And a note on scale. I thought this was very common, but a colleague told me it was not.
05:20
But there's this rule that a team should not have more people than you can feed with n pizzas. I heard n being two. We were joking that maybe American pizzas are different from European pizzas. They're massive, so you can have a 10-person team. Regardless of the number, I don't think there's a shortcut for reality. I think there are projects that require
05:43
more or fewer people. Basically, you need as many hands as you need to code, but the number of people that you have in the team should not become a bottleneck. So you need to have strong communication between all the team members. You're rowing
06:01
in unison. You need to sync that rowing. And for me, every time I see this slide, it has a sound. But you're creating a party, and who should you invite to the party? I think there are three main things to consider when you're building the team.
06:23
And this is not skill-based. It is about the mission. But basically, you need to ask yourself three questions. First of all, who is a good fit for the team, and who will help to fulfill your purpose? This doesn't need to be someone who has a perfect background and is aligned completely. I was a text person, and now I'm a multimodal person.
06:47
But my alignment with the mission is complete, so I can do both. And this is something that is very important. If you have highly motivated people who align with your mission, that's what you start with. They will be amazing teammates. And I think we should not
07:08
compromise here. I hate the... Oh, but they are great engineers. They are terrible to work with, but they are so good. Or the opposite. They are so kind, and we need to give them. I think
07:22
we should not compromise. If there are the right people with the right alignment, you can mentor them into being the best engineers and the best researchers on the team. And there's the opposite. Who is threatening the project? Who are the people who might want to join because your project is very flashy? As you can imagine, I work in a very flashy project,
07:46
and people want to join us. But if they have a different research agenda, if their timelines don't align with ours, it is unfortunately not a good time to join.
08:00
And it's absolutely fine. The train keeps going, and there might be another station where people can jump on. But since you have the mission, you need to be very well aware of who should or shouldn't jump onto your train. And then there's this third category of who is in the project, but for no specific reason. And I put in this category,
08:25
like advisors, the sponsors. They are not touching the code base day-to-day, but they have the power to make or break your amazing changes. So get the best advisors for your code base, because they will open the doors, they will advocate for you. There are places
08:44
where we engineers cannot enter, but your sponsor, your advisor can. And what is the problem that I'm going to talk about? Gemini has been around for a very long time now, considering the very fast pace. It feels like ages ago. So there are things that have gone, and things that
09:06
stay. I'm very proud to say that this tiny thing that we built is still standing strong. We all love Python here, right? Python's flexibility is absolutely great for experimentation. There's nothing that brings me more joy than, if I don't know
09:22
something, I bring up a Colab, and I can test it without any blockers. I can test things super quick, super prototype. I used to love teaching with Python. I teach Java and Python, and I dread the Java classes. I'm so sorry, Java people. But I really love teaching Python. However,
09:46
I'm working in a very big code base. If you see the Gemini paper, you can see how big the team is, and that unconstrained flexibility can become a challenge. And I have here a tiny example.
10:02
Don't worry at all if you don't understand this. It's a beginner talk, and it's really for beginners. This is a simple MLP written in Flax. That's the thing that we use internally. But I want to ask you, what do you think is the type of x? What is x?
10:29
Yeah. I don't know. It's basically impossible to know. You know that it's an array, but what's the shape of the array? What is the type of the array?
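To make the question concrete, here is a minimal sketch of an un-annotated forward pass. It uses plain NumPy rather than the Flax module shown on the slide, and the names `mlp`, `w1` and `w2` are illustrative assumptions, not the talk's code.

```python
import numpy as np


def mlp(x, w1, w2):
    # Nothing in the signature says what x is: rank, shape and dtype are implicit.
    hidden = np.maximum(x @ w1, 0.0)  # ReLU
    return hidden @ w2


rng = np.random.default_rng(0)
w1 = rng.normal(size=(4, 8))
w2 = rng.normal(size=(8, 2))

# All three calls run, so the code alone cannot tell us which input was intended.
single = mlp(np.ones(4), w1, w2)                      # one example, shape (4,)
batch = mlp(np.ones((3, 4)), w1, w2)                  # a batch, shape (3, 4)
ints = mlp(np.ones((3, 4), dtype=np.int32), w1, w2)   # integers are silently promoted
```

The function happily accepts a single vector, a batch, and an integer array, which is exactly why a reader cannot answer "what is x?" from the signature.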
10:43
And I might hear you, like, oh, but Python has type annotations. Mai, use type annotations. Sure. I know I could do it better, but for the sake of the argument, let me do this. It's better. Now we know it's a JAX NumPy array, which is great. But what is the type
11:03
of the array? What is the rank of the array, basically? What are the shapes? And now we might restrict ourselves too much. Like, this works with a JAX NumPy array, but it also works with a NumPy array. So maybe that's not what we want. And the problem is, like,
11:23
this is absolutely perfect when you have text. Everything is an array of integers. But when you have multimodal, then everything becomes tricky, and you're trying to twist the codebase to actually allow multimodality. So what we set ourselves to do is have good type
11:43
annotations that will allow us, if I open a codebase, if I open any file, any model, to know exactly what the shapes of the operations are, more or less having good documentation at some points. So we have a purpose. We have
12:03
a shape of the team. And I would be doing a disservice to this community if I didn't talk about diversity here. Because honestly, diversity will be what makes or breaks your team and your project. It's not like, oh, I want to do it because it's the right thing to do. But even if you
12:23
don't believe it's the right thing to do, you're going to fail. And maybe controversial, but good exclusion leads to diversity. And what I mean by that, I love working with my mates. I absolutely adore hacking out and playing around. But that means we became a clique,
12:42
and it's really hard for people to join in. And it's really hard for me to join something else where I could be a good fit. So you need to be very well aware of the group that you make and how to encourage people to join. And I know it's 2024. I should not make
13:00
this argument anymore. But for people, for your friends at home who still think, oh, they were a diversity hire, they are not that good engineers: there's ample evidence against this. Diverse teams lead to better outcomes. Diverse teams are more fact-focused. They
13:21
challenge each other. It can be somewhat challenging at the beginning because we come from different backgrounds. But we end up being way more efficient. And even if you don't believe in any of that, if you want to be better, if you want to be more innovative,
13:41
you need to have more diversity. But diversity cannot be lip service. You need to be aware all the time of the things that you do and the things that you communicate to make people feel welcome in your team. At the end of the day, you want to bring the best engineers and the best researchers to your team. And that means being extremely mindful
14:04
of your diversity policies. I work in a very fast paced environment. So it's really easy to forget, like, oh, we are great. We can do great. But that's not the way to go.
14:24
We need to show vulnerability and be open to listening to everyone. I am going to drill this message through the whole talk because, honestly, that's what brought us to having these great models. You want people to be able to challenge your views.
14:41
You want people to be able to tell you you're wrong. I'm very grateful when someone kindly tells me I'm wrong about something because I learn something. And there's a lot of evidence of companies who have failed to do this, and they have diluted their technical expertise and they will not succeed. And then a point about cultural differences. I'm Spanish, which means
15:07
I'm intense. We have a meme that is gasping in Spanish. I don't know if it's very open to everyone. But, yeah, we have a meme that is gasping in Spanish. So that means different cultures might have different tolerance to confrontation. So it's good to remind yourself,
15:25
like, if you have built a very diverse team, you have brought in people with different cultural backgrounds. And even within the same cultural background, they might have diverse communication styles, different ways to approach things. And there will be times where you're on
15:43
a tight timeline and you need to remember this to be effective. So you have your team and you need to have a lead. Leaders are, well, I don't think I should argue here that we need benevolent dictators. But creating a code base is an exercise of power.
16:12
We are challenging everything. So every time we create something new, the things that we did before should be challenged. And unfortunately, there's a lot of things that we can decide
16:24
with data. But there's a lot of things that we cannot. And if you're a lead, you need to be able to allow people to express themselves, give them power, and move the project along. And, again, listening to everyone, if you have brought a very diverse team of very passionate
16:43
engineers, very motivated, very aligned with the mission, you're going to have loud engineers. I was brought here to this specific project when we were designing the code base because I was complaining a lot. And my lovely lead told me, hey, you complain a lot. Maybe you want to fix it.
17:01
And I definitely wanted to fix it. But that means there will be people louder than others. And if you have any privilege, and I want most of us, some of us have privileges and others don't, use those privileges to allow other voices in the room. And you must allow disagreement.
17:23
But you can disagree for a limited amount of time. You need to keep the project going. The train needs to keep going. So it is your position as a lead to keep things moving. And before I move on, a tiny note on how to influence. Because honestly, I feel like
17:43
80% of the material that I read is either toxic or harmful. So let's take a minute to talk about how to influence. And I say: influence with kindness. And kindness sometimes is a little bit of a hand-wavy concept. We use an even more hand-wavy concept,
18:04
that is Googliness, but it should not be. Clear is kind, and unclear is unkind. That's a Brené Brown quote, but I will wait for it. It's like, kind leadership will correct you when you're wrong, will help you connect with the engineers, will keep the project moving.
18:20
But most importantly, kind leads will lead the project for the benefit of the project, for the benefit of the mission. They will benefit at the end, because they will be successful. But obviously, that's not the goal. The goal is to move the project along and bring the project to fruition. And obviously, this project would not have been possible without
18:46
the amazing lead that we have. And if you're as eager as me, you're now thinking, okay, we have our mission. We have a problem. We have the code base. We have the team. Can we start coding now? Please, please, please? Unfortunately not. I've been speaking 26 minutes
19:04
and I still cannot talk about what the actual solution is. Because the actual solution will fit right in once you have all the other pieces in place. And you should not start a solution with the logistics. A new change doesn't start with a PR or with a design doc. It starts with
19:28
gathering information. What were the other solutions that people tried? Obviously, we don't want to reinvent the wheel. We want to use a wheel that people have used, and understand what the frustrations with that wheel were and what people want from that wheel.
19:43
And here's when you need to exercise your soft skills. And every time someone says soft skills, and it sounds like they are doing the air quotes, I cringe. Because what I hear in my tiny brain is that they are trying to undermine the skills that people socialized as women have.
20:05
Soft skills are not easy at all. And you can build the most flamboyant code base change. But if you don't have the soft skills to get buy-in and to get people to use your changes, you will fail.
20:21
Absolutely, you will fail. And start with a why. Like we said before, right? I'm standing here very proud. I'm very happy with the things that we did, and that across multiple iterations, the changes to the typing have been maintained. Not because the actual solution that we built was the best. It's because it's the solution that made our users
20:44
happier. I would be just as happy talking about any other library. But we got buy-in first. And I like to remind myself of this phrase. It means something like: you might win,
21:04
but you cannot convince. And this is something that matters when you have a very tight timeline. This was a very small team with very skilled engineers; we could have forced through the code base changes and said, well, friends, this is what we're going to do. It is done. But two years,
21:25
three years from now, I could not stand here and say the things that we built are still standing strong. So you need to convince, you need to get the buy-in. And now it's time to do the research. So you have gathered a lot of the conversations on the sidelines.
21:41
You have understood who the people are who are going to use this, and who have used different solutions in the past. So now it's time to do the research. And you want to time-block this thing. We have a saying that there are doc people and code people. And we need both. But we need to time-block
22:01
both. We need to spend the time doing research, and then we need to execute. We are going to make wrong decisions or okay-ish decisions, because imperfect information leads to imperfect decisions. It is fine. Make your peace with it. And you got your A team, right? Now it's time to allow them to bring their A game.
22:26
And how do they bring their A game? I'm very happy that EuroPython this year has this strong focus on neurodiversity, because it's a very important topic and very dear to my heart. But you want everyone to be able to contribute to your meetings, to your
22:44
research in the same capacity. So if you need to meet, set an agenda. If you're meeting for no reason, there's no reason for you to spend time on a meeting. Send preparatory material in advance. So figure out what the preferred ways are
23:02
in which people like to digest the material, and how much advance time they need. And then schedule meetings at a time when everybody can participate. This might not be common for everyone, but well, it might be, because a lot of open source is very distributed. But if you schedule a meeting at 11 p.m. London time,
23:25
there are a lot of people who are not going to be able to make it. So if you want your A team pool to be as big as possible, don't restrict yourself by doing this kind of thing. And finally, this is actually the design doc for the changes that we made.
23:44
Obviously, it was not typing. There are some things I could not share. But it was even before we were Google DeepMind. We were just DeepMind at the time. This is the goal that we set ourselves. We want to enable type annotations and runtime checking for JAX. We use JAX.
24:06
And we should care about shapes, dtypes, array types, and value ranges. Value ranges we did not, at the end of the day, type, but they were in our motivation. And I put it here. I didn't hide it, because sometimes you set up a goal that is maybe
24:22
too big for you, and you have to time-block both things. You need to time-block your research, you need to time-block your experimentation. And I'm very happy about the timing of this talk, because I can talk about what is
24:41
doing a hackathon. Hackathon is showtime. I love doing hackathons. It's particularly because my team is extremely distributed, so sometimes you forget that people are actually people. They're not trying to mess with you. They care a lot, and they're very vocal, but when you sit together, hack together, it's... Well, I really enjoy it.
25:02
It's really, really fun. But you need to understand that venues come with scripts. It's not the same hacking in the basement of your uni, in your home, with your friends, having beers (you might be having beers), or in the sprints that we're going to have here tomorrow and Sunday,
25:23
or in this lovely London campus of Google. Hackathons take people out of their comfort zone, and honestly, it's very daunting the first time that you go to a hackathon. So if you want to be welcoming to newcomers, be extremely explicit.
25:42
So for sprint organizers, tell the people that are going to participate: what are you going to do? How are you going to triage the bugs? How are you going to involve people? The more information that you give, the easier it's going to be for people to decide, "I actually want to come here and participate," because they have all the information.
26:06
Explicit is better than implicit, so make the etiquette obvious so people can feel welcome. Otherwise, you will end up having this tiny cool-kids club where people will not feel welcome to participate. And now, okay, so this was the hackathon time.
26:28
It's time to put the pedal to the metal, and honestly, the hackathon was super helpful for us. We ended up using jaxtyping. There are multiple solutions, but we decided to use jaxtyping because Patrick Kidger, the original author of this
26:44
library, the maintainer, was working at Google at the time. So in our pregame, we talked to him and we discussed the things that could help, the things that were missing that we were going to implement, and how we were going to collaborate with him to open-source the changes.
27:03
And jaxtyping provides type annotations and runtime type checking of shapes and dtypes for JAX arrays and PyTrees. You don't need to know what a PyTree is. And the footnote is very big on the big screen. But it also supports PyTorch and
27:22
NumPy and TensorFlow. That was not a requirement for us, because we were very clear about what we were going to use. But if it is for you, please be aware that this also supports your other ML libraries. So what does jaxtyping look like? jaxtyping annotations are compatible with runtime type
27:49
checkers. It can work with either typeguard or beartype. typeguard is slightly more thorough, so that's the thing that we ended up using. And it has this lovely shape. It's very easy
28:03
to understand, to read, so people were very happy with this solution. It is more verbose, but it is worth it, because if you're going to run a computation that takes weeks, if not months, you probably don't want a runtime error in the middle of the
28:22
computation two weeks from now, when you have spent a lot of GPU or TPU time. So I argue that this verbosity is very, very helpful. And it's simply the type, the array, and then the dimensions. This is a matrix multiplication, which means
28:44
we are going to have dimension one, dimension two, and dimension two should match on the other matrix. And the outcome is dimension one, dimension three. So it's really easy to understand what the rank of the arrays is and what the type is. In this case, we know that the precision is float32, which is absolutely amazing.
29:04
And what dtypes do we have? We have Shaped, Bool, integers, and floats in all their precisions. And this is what it looks like. And you can see here how text is lovely. And then when you go... Because text is like... It's an array of integers with shape number of tokens.
29:26
And it could be batched or not batched. But then you have images, and it's like... Well, we have height, width, channels. It can be batched, it can be not batched. And we can even define something like MNIST, which is a very popular ML dataset. It has integer images of 32 times
29:47
32 and three channels. Which is great. Honestly, we did not end up using that thing, because it was too restrictive for our use case. And the symbols: how to define the shapes of the arrays. We can put an integer. Here we can know an image has three channels. It could have
30:06
four if you have the alpha channel. But normally all the images that we work with have only three channels. We can use a string. We've seen named dimensions before.
30:21
We ended up using height and width, because we know what height and width stand for. And you can do symbolic expressions. You can know that this is a function that reduces the size of the image, and the reduction factor is 2. So you know that you're going to end up with an image that is half the height, half the width. And it has modifiers. These are not all the
30:48
modifiers that jaxtyping supports, but the ones that we ended up using the most. I sneakily already showed you the asterisk. The asterisk indicates zero or more. If you're familiar with regexes, it's exactly the same. We can have an image that is batched or not batched.
31:05
So we can have one or more dimensions. And then sharp indicates zero or one. And finally, the underscore tells the jaxtyping checker that an axis is not going to be checked. jaxtyping is a runtime checker, which means checks are run exclusively during
31:27
tracing. Pre-training people would have murdered me. I would not be standing here if we had increased the budget for training by even a tiny, tiny bit.
31:41
So this was a solution that has basically no runtime performance cost. You can say, sure, there's a little bit of overhead. When you do the tracing, you need to run the checker, check the types, but that's worth it. If you start running and suddenly, two weeks in,
32:02
you have a type error... You don't want that at all. And since I said we don't just want to win, we want to win by convincing people, we set up this three-phase attack. So the first stage was defining the standard types. We can here have the image,
32:24
the MNIST image, the MNIST label. Again, this is too narrow for our use case because we end up having too many things. But we first annotated all our arrays with standard types. We did some
32:43
analysis. For example, we didn't actually care whether it was a NumPy array or a JAX array, so we created an alias for that so people could start using it. Then we annotated our models. So models and data sets were annotated later. Still no type checking is happening. It's just documentation
33:04
for now. And you can see this is a function that loads the data set. This is a batch class, and I should have it done. Sorry. Then it loads the data set, and we know it ends up being a batch of things. And finally, we perform the type check. So we implemented this utility decorator,
33:29
which basically allows us to type check the functions. And the good thing about this solution is we can decide if we want to type check a function or if we don't. Maybe sometimes we want flexibility, and we write Python because we like the flexibility that it allows.
33:46
So we wanted to be very mindful of allowing people to be as flexible as they used to be. And this last line, the no-magic thing, was a requirement that we got from our users.
34:00
Aaron and I were super excited about doing things like, oh, if we know it's an int and you expect a float, we can do the casting, and we can do broadcasting internally. Our users didn't want that, because what happened when we tried to do that is that debugging was really, really hard. So instead of investing time in implementing these things, we invested time in implementing
34:21
solutions that our users actually needed. And our users were also us. So we didn't want to implement things that were not useful. We ended up implementing better messages (Aaron and Patrick worked to open source that solution) and better data classes, because our code base is absolutely riddled with data classes. So as you can remember, getting information
34:45
from your users is critical to having a successful project. And as with all good things, everything should end. And you must accept that everything has an end. Today is the last day of EuroPython, and I hope we have all had a very fun time. And also, there was an end to the
35:07
jaxtyping and codebase design. That wasn't our main goal. We just wanted to build something to keep working on top of it. So this is sometimes triggering for people. But you
35:22
must accept that all the code that you build will become at some point technical debt. And I'm hoping that maybe in one or two years, the kids will have figured out a better way of doing things. And I will get a PR from people removing my code. And it will be absolutely fine. I'm
35:41
very grateful for the things that we do. And we want to empower people to have the best solution for the things that they need to do. And on the note of empowering people, we have this tiny but open chat where people can join and ask questions about how to debug this crazy thing.
36:01
And having 28 members is actually a good thing. I'm very happy about that, because that means people who use our code base and are not directly in Gemini are very happy with the solution, and they don't need to join and ask us questions. And we wrote internal
36:21
documentation of how to use it. Because we implemented things for our use case, it cannot be open-sourced, because no one cares about our use case. And I'm very grateful for the time. I said that this was a love letter. And it's a love letter. And personally, I want to thank Gabe and Aaron. They were absolutely brilliant. We still work together. And every time
36:43
I have the opportunity to work with either of them, I'm incredibly happy. Since I'm already here saying thanks: thanks a lot to the EuroPython organizers. We all, well, I know how hard it is. Thanks to the volunteers who take care of us so, so well.
37:02
And last but definitely not least, thank you, all of you, because you have been here very early. And as for the inspiration for this talk, I was reading this book, ironically, because I used to be a conference organizer, a Spain organizer, and I was thinking about how we could do this more effectively. But it was at the time
37:26
of the first Gemini hackathon. So I felt like, oh, this is actually more or less the same. And this second one is a talk. When I was preparing the slides, I was like, oh, gosh, I was actually very inspired by this. This was Titus Winters' last lecture at Google.
37:45
Well, Google, he's at Google. Titus is the reason for me to get my C++ readability before my Python one. I highly encourage watching any of their talks. You learn a lot. And since I like to end
38:01
with a TL;DR, I asked Gemini. I pasted all my speaker notes and asked: please, please, Gemini, can you summarize the talk? I'm too tired to do this. And this is what Gemini said the talk is about: building a lasting codebase requires diverse, kind teams, united by
38:23
a clear mission, empowered to make decisions, and driven by empathy and open communication. And I think it did a very good job, honestly. Okay, this is the actual. Thank you very much to everyone. I think I have some minutes for questions. And if not, I'll be around. I'll be
38:43
in the sprints. So you can grab me at any time and ask me anything you want. Thank you very much. Thank you. We do have time for questions. Please gather around the microphones in the middle of the
39:07
room. So if you have questions, and this is just a small token of thanks from EuroPython for coming. Thank you very much. Thank you very much. Do we have questions? Oh, there's one. Okay. Hey, thanks for the talk. I was wondering more in detail because we had
39:25
this problem a lot of having a very experimental research team and kind of like production code team. We really tried to make them collaborate in the same repo. It is really hard. Yeah, I really wanted to know your experience about like research code bases having maybe
39:45
some kind of research folder and then moving things over or like do like this whole mess of research for this production. Yeah, I feel you. I feel you. And that's why I empathize first. I'm a research engineer, so I wear both hats. We have two areas of the
40:05
code base. So we have an experimental, sorry, we have an experimental phase and a stable thing. Things need to meet certain quality requirements to move from experimental to core.
40:22
To be completely honest with you, that doesn't always happen. You still need to be flexible with things. Like I got more call-ups with changes that I could admit. But still,
40:41
I think it's like achieving a compromise, like you want to run very fast. So you try to, it's a matter of empathy. Like researchers want to do the research, sorry, I keep moving. Researchers want to do the research. Engineers, we want to have clear, lovely code bases and have tests. And then sometimes it cannot happen. It is a very tricky
41:01
question. So basically the answer is: it's hard. Solve it one case at a time, and try to explain your researchers' needs to the engineers and your engineers' needs to the researchers. Thank you very much for the question. It was very useful. Hey.
41:23
Hi there. First of all, thank you very much for your talk. Super interesting. A similar fuzzy question, I guess. So you've talked about kind of creating this typing to help you with your coding. Yeah. Given that you have tight deadlines and you need to do the coding, how do you decide how much time and how much priority to put to the typing,
41:46
which isn't actually doing the work, but it's helping to do the work kind of. Yeah. It's being brutal with the time blocking. Like those things that we want to do, like for example, we wanted to have ranges of the arrays, like this is going to be this
42:02
shape and this shape. And it was impossible. It was honestly impossible to make it on time. And I think having a very efficient team is good, but so is understanding that it's not going to be perfect. You have limited time. If I told you how tight the deadline was for the first time that we trained,
42:23
it was completely insane. You have to compromise with your team members, right? So the people that you have, they are very efficient. They're very motivated and will try to do their best. But yeah, it's like trying to compromise. It's not easy. Just be brutal with the time that you're going to allocate to anything.
42:41
And so who is that, who makes that call? Is it the Benevolent Dictator that you mentioned before? It's the Benevolent Dictator. Because sometimes like every team member should be aware, but sometimes I am very passionate about what I do. And I want to do good research and good code bases. So sometimes I'm like, no, no, no, I'm going to push. And it's like,
43:03
I'm not sleeping for two days. And someone needs to tell you the toy is not up for playing anymore. So if you are the person who can say, oh, this is what I do, it's great. But if not, I am not that person. Rely on people who can tell you honestly, hey, you try, but this is
43:24
not the time. It will come. There will be a new, like the train is still moving. Yeah, cool. Thanks very much. No worries. Thanks. Hi, thank you for the talk. Could you give me a practical example to help me understand better
43:41
what you meant when you mentioned practicing good inclusivity to prevent cliques from keeping other people from joining? Oh man, yeah. It is really hard, because you have people who are super passionate about things and then it's really easy to form that clique. Like we had a lot together and it's like,
44:07
it's really hard to onboard. So I feel like there are people around the engineering or the technical side who need to tell the team to open up. You know the rule in a conference, like you need
44:21
to have like a Pac-Man shape. If the team is able to do the Pac-Man shape on their own and allow people to join, it's great. If not, it's again, the job of the benevolent dictator to like literally the project that I am in, I'm a very good match for that project, but it was
44:42
very, very close. And I had someone who talked to the team and talked to me, like, you're going to be good for the team. And honestly, we've been working amazingly for the last six months. But you need to have that external person, because it's really easy when you're under such tight deadlines. Onboarding people is not easy, and like trying to understand where is the
45:06
alignment of the person: do they want to join because they want the star, or do they want to join because they care? You need someone who does that pre-game work for you. Was it useful? Did I answer? Thank you. Thank you for the question. That was very useful.
45:26
If we don't have any more questions, thank you, Mai, again for coming.