
The State of Production Machine Learning in 2023


Formal Metadata

Title
The State of Production Machine Learning in 2023
License
CC Attribution - NonCommercial - ShareAlike 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Abstract
As the number of production machine learning use cases increases, we find ourselves facing new and bigger challenges where more is at stake. Because of this, it's critical to identify the key areas to focus our efforts on, so we can ensure our machine learning pipelines are reliable and scalable. In this talk we dive into the state of production machine learning, covering the concepts that make production machine learning so challenging, as well as some of the recommended tools available to tackle these challenges.
Transcript: English(auto-generated)
All right. So very excited for today's topic. This is a topic that I have shared in previous conferences, but it always keeps evolving and it keeps evolving really, really fast. So today we're going to be diving into the state of production machine learning
in 2023. There's going to be a broad range of topics that we're going to cover at a very high level, because these are topics that have been covered in previous conference talks. So what you will find are references that will allow you to dive deeper into each of these topics. So
pretty much a slide will, you know, consume your entire weekend on a rabbit hole learning about it, hopefully for the good. A little bit about myself. So my name is Alejandro. I am currently Director of Technology at Zalando. I'm also a Scientific Advisor at the Institute for Ethical AI and Governing Council member at
large at the ACM. So as I mentioned, there are going to be a couple of links that you can dive into. These are from previous talks that I have done, but also interesting references that you can check. You can find the slides in that top corner, so you don't have to take pictures of everything and keep them; you can actually access the slides. So what we're going to be
diving into today is five key sections, starting with motivations and challenges. Why do we care about production machine learning? Then we're going to dive into some trends that have been emerging in industry and also in academia and
in society as a whole. We're going to be then diving a bit deeper into the technology and then we're going to be looking at the organizational considerations themselves as well. So by the end of this talk, you should have an intuition of the very high level societal implications and considerations, but also how that reflects into the technology and then
bringing some insights that you would be able to not only add into your tech stack, but also into your teams and into your organizations hopefully. So starting with the motivations and challenges. So one of the things that has become crystal clear is that the life of the machine learning models does
not finish once they are trained. That's only the beginning, and it's only when you actually start consuming those models, and more specifically machine learning systems as we will see today, and really getting value out of them, that you'll start seeing challenges you are not going to see in your experimentation phase. You're going to see things like outliers, you're going to see drift. From the moment the model hits production, you already see degradation that has to be handled with certain considerations. So as part of that, this is going to be one of the core principles that we're going to be fleshing out.
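To make that drift point concrete, here is a minimal sketch of the kind of check you might run on incoming production data. It assumes a single numeric tabular feature and uses a two-sample Kolmogorov-Smirnov test from SciPy; dedicated libraries such as Alibi Detect or Evidently provide richer detectors, so treat this as an illustration rather than a production recipe.

```python
import numpy as np
from scipy.stats import ks_2samp

def feature_drifted(reference: np.ndarray, production: np.ndarray, alpha: float = 0.01) -> bool:
    """Flag drift on one numeric feature by comparing training and production distributions."""
    result = ks_2samp(reference, production)
    return result.pvalue < alpha  # a small p-value suggests the distributions differ

# Hypothetical usage: compare a training feature against a recent window of production inputs.
rng = np.random.default_rng(0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)
production_feature = rng.normal(loc=0.5, scale=1.0, size=1_000)  # simulated shift
print(feature_drifted(training_feature, production_feature))  # True -> investigate before quality degrades
```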
But we need to ask the question of, well, why is production machine learning so challenging? What are some of the key areas that make it not only difficult, but also different from traditional software? Some examples of where it differs from traditional software microservices: there is specialized hardware. When it comes to the productionization of models,
you have to involve not just special accelerators like GPUs, TPUs, but also in some cases, very large amounts of memory, right? For example, models that require a lot of RAM or a lot of VRAM. You also require perhaps special compute. And as part of that, that involves complexity in
the orchestration of your models themselves as you reach larger scale. There are complex data flows, right? It's not just about the model itself and its inputs and outputs, but the potential impact that considerations can have upstream or downstream, right? There are compliance requirements, particularly when it comes to machine learning. It tends to be also very, very
tightly coupled to the domain use cases, which means that often there are a lot of compliance requirements, and in some cases even ethical requirements, when it comes to the productionization of machine learning systems. And then another area is the reproducibility of
components, right? It's not just the deployment of the application or the code, but that combination of code plus environment plus artifact, and then the versioning of that, making sure you're able to have that reproducibility so you can introduce determinism into your environment.
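As a hedged illustration of that code-plus-environment-plus-artifact idea, the sketch below writes a small reproducibility manifest; the function name and manifest layout are made up for this example, and tools like MLflow or DVC solve the same problem in a more complete way.

```python
import hashlib
import json
import subprocess
import sys
from pathlib import Path

def write_reproducibility_manifest(artifact_path: str, out_path: str = "manifest.json") -> dict:
    """Record the code version, environment, and artifact hash that produced a model."""
    manifest = {
        "git_commit": subprocess.run(
            ["git", "rev-parse", "HEAD"], capture_output=True, text=True
        ).stdout.strip(),
        "python_version": sys.version,
        "packages": subprocess.run(
            [sys.executable, "-m", "pip", "freeze"], capture_output=True, text=True
        ).stdout.splitlines(),
        "artifact_sha256": hashlib.sha256(Path(artifact_path).read_bytes()).hexdigest(),
    }
    Path(out_path).write_text(json.dumps(manifest, indent=2))
    return manifest
```

Pinning all three pieces together is what later lets you rebuild the exact deployment and debug it deterministically.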
If you want to dive deeper, these are the links you can find for talks that cover these challenges specifically. But then going one level higher, why is production machine learning so challenging even at the societal level? And this is part of the point that it goes closely with that use-case-specific area. There are challenges that you
may have heard already, particularly in high profile cases in the news around algorithmic bias, right? Whether it is discrimination due to undesired bias within the models, and that in itself, again, you know, there is a very interesting field that you can delve
into to understand about that explainability, interpretability, bias. There is also the challenges that you have in traditional software, which is basically software outages, right? What happens when your actual infrastructure falls down? There's the misuse or the challenges that come with the data itself, and then there is an element of cybersecurity which we are going to be diving into and that it's exciting to see that there's a lot of topics that are coming
up now, especially at this conference; there are a couple of exciting talks on cybersecurity. And then of course, you know, it couldn't be a state of production ML without the LLMs. This in itself doesn't really change the challenges that machine learning introduces, but I think you will see, as part of the examples I will be giving in this talk, that it makes them a little bit more intuitive, right? And it makes the need for these production machine learning considerations clearer. So I think that's the one thing, beyond the hype, that we will benefit from. And this includes complex architectures that will be described as this data-centric view of machine
learning that involves multiple components, right? You have seen most likely when it comes to the world of applications on LLMs that you would see the use of machine learning models in ways that are very creative, right? The machine learning model interacting with APIs, interacting with databases, and then bringing the prediction together with that combination. So it will kind
of like give you an intuition of some of those challenges. Now, in order for us to tackle those challenges, it also involves some considerations of the skills required for those challenges, right? And this is something that has now become a little bit more ubiquitous and standardized
and understood is this intersection of the skills between software engineering, data science, and DevOps or platform engineering, which is basically the skill set of machine learning engineers or MLOps engineers. So this skill set is something that has now become even more prevalent within data science teams as their requirements and productionization
grow, and we will actually touch upon that in the organizational shapes in a bit. But then it's not just an intersection of skills; it's an intersection of domains themselves, right? So you have the intersection of the knowledge required within the machine learning expertise, but also the industry domain expertise, and then, as we will see as well, the policy
expertise, right? That is, how do you make sure that you're doing things correctly from a technological sense, but also that you're aligned with the industry requirements, and then aligned with the higher-level considerations? But one thing that we will see as well is how to think about this, right? Because right now it's very, very abstract. So for that, now let's
dive into some industry and domain trends, right? So we're going to go a little bit deeper, still very high level. We have been seeing that, and this is something that I really liked how it was verbalized by the Linux Foundation, is that we started with this description of AI ethics, perhaps in like 2015 to 2018. Then we went into responsible AI, which is basically okay,
let's discuss the higher level, then let's discuss the best practices. And now we're talking more about accountable AI, right? How do we actually hold people accountable, introduce whether it's through regulation, through policy, through standards, things that allow us to understand what best practice looks like and what should be the bare minimum in some
areas. So in this case, the way to think about it from a hierarchical perspective, you of course have those very high level principles and guidelines that give you that north star. But then from that, you have to get a little bit more concrete. What are those industry standards, those regulatory frameworks, those even organizational policies to be compliant
for certain requirements? But then similarly, what is absolutely critical is not just to have those north star principles, but it's to make sure that those open source frameworks that are now ubiquitous within industry and academia, they are also by design aligned for those principles,
right? Because in essence, we can have as many round tables as we want, and we can all agree that discrimination is bad. But if the underlying infrastructure and the foundation is not enabling this by design and it's not built with those considerations in order for them to be empowered, then we're not going to be able to achieve what we are setting out to.
And similarly, if we actually see it from an organizational standpoint, the thing to emphasize as well is that large ethical challenges or even large compliance challenges should not fall on the shoulders of a single data scientist or on the shoulders of a single software engineer, right? Because of that, it is not just required for individuals to be responsible,
right? Because also one thing that we have seen in the past is that you can have a situation where a group of individuals are all responsible or ethical, let's say, but then the outcome as a whole may not be, right? And that doesn't mean that people were just sitting there thinking,
how can I build the most racist algorithm that I can possibly do, right? And when we've seen that in the high profile cases in the news, I hope that that's not what the people are thinking, but most likely it isn't, right? And that emphasizes that it's not just about individuals, it's about the compound. So what that means is, of course, from a personal perspective,
having that sort of technology best practices, being able to work in the areas where your competence is relevant, having those areas of professional responsibility, like the ACM has code of ethics and professional responsibility that we always kind of recommend and point to, because it's also from the ACM. But then broader to this, you also have the team, right?
And how you make sure that there's that cross-functional skill set that balances each other, how you have the key domain experts, how you have the relevant alignment within that. And then at an organizational level, it's also important, how do you introduce the relevant touch points, human touch points, that ensure that the respective domain experts,
most likely in several cases non-technical, will be able to provide that human decisioning, right? And ensuring that accountability can be distributed, as opposed to just having a single individual accountable. And of course, one thing that has now become a little bit more real, and which is really interesting to see, is that before we were talking about how regulation was playing catch-up, but now it is tech companies playing catch-up with some of the regulation being rolled out. So recently we actually submitted a contribution to a consultation for the European Union's AI Act, which is going to, you know, come into force in a
couple of years. So now actually companies are thinking, how are we going to roll this out, like how are we going to introduce the processes? Similarly with, you know, the other regulations in the EU, but we're seeing similar things in other parts, like in the UK. Again, we submitted for their current initiative for, they call it the Innovation
First AI Regulatory Framework, and for that they're also looking and thinking, how can we achieve the best practices being rolled out within the organizations whilst encouraging innovation, right? So it's really kind of having a balance of both areas. And again, this is to emphasize,
back to the point that I mentioned, but making more of a call to action to the people in this room, right? That of course we can have all of these, you know, regulatory frameworks, all of these principles, but really it is, you know, the foundation that now actually runs a large percentage of our society, these open source frameworks, not just machine learning frameworks, but also just general software frameworks that are required for, you know,
individuals like ourselves to be involved in this discourse, right? Because we are really guiding and supporting how these best practices or how these mitigations of bad practice are formed and rolled out in society. So, I mean, and the ways to get involved as well, I mean, ultimately
are as simple as just attending conferences like this, but also getting involved with open working groups like from, you know, the Linux Foundation or the ACM. So that's something that definitely would be a call to action for anyone that would like to get involved in those. So now actually let's go a level deeper into the technological trends and, you know,
some of the tools and frameworks that have been growing within the ecosystem. So yeah, I mean, just as a refresher, you know, back in the day, you know, this is how it started, right? Like simple, you know, you could just pick and choose and it was easy. But now it's a little bit harder, right? We have a very, very large tool set and I think since the rise of LLMs,
it's like one every hour, pretty much, right? So yeah, on the question of how to navigate this, what I do want to highlight here is a bit of a plug of one of the resources that we actually maintain: the awesome production machine learning list of frameworks on GitHub,
and actually we are celebrating its fifth year and we just broke 14,000 stars. So, you know, our call to action is not to just, you know, go and add more stars unless you want to, but to actually add PRs and anything that is missing. We are a little bit more strict on what
is added now because, you know, otherwise it would be like enormous. But definitely if you're interested, like this would be a great also way for you to discover new areas, right? Like diving into tools. But yeah, so now putting a little bit kind of like into a shape, this set of frameworks, if we want to like get an understanding of what is the anatomy of production
machine learning, right? The way I like to think about it is basically this sort of production blueprint, where we have the training data, the artifacts, the trained models, and then some inference data. We start with experimentation, where we train models on that training data, whether through a workflow manager, an ETL pipeline, or just a notebook, to generate trained models, right? From that, we want to do something with these trained models. We want to be able to make them available for consumption.
So we would be able to do that either manually by, you know, publishing our Jupyter notebook (don't do that), or by properly, you know, productionizing your machine learning models into an environment, in this case as offline, real-time, or semi-real-time machine learning models. Ultimately also introducing observability and monitoring,
things like drift detection, outlier detection, and in production observing the inference data, right? Like running inference on unseen data points. Ultimately the objective is to be able to make use of that inference data at some point, whether for training data or for analytics; there are some relevant use cases there. And of course, the metadata that interoperates around all of this is what completes the whole picture of this anatomy of production machine learning.
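Compressed into a few lines, and with scikit-learn and joblib standing in for a real workflow manager, model server, and metadata store, that blueprint could look roughly like this sketch:

```python
import json
import time

import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# 1) Experimentation: train on the training data and persist a versioned artifact.
X, y = make_classification(n_samples=1_000, n_features=5, random_state=0)
joblib.dump(LogisticRegression(max_iter=1_000).fit(X, y), "model-v1.joblib")

# 2) Serving: load the artifact and expose a predict function
#    (a real deployment would sit behind a model server, batch job, or stream processor).
model = joblib.load("model-v1.joblib")

def predict(features, log_path="inference-log.jsonl"):
    prediction = int(model.predict([features])[0])
    # 3) Observability: capture inference data so it can later feed monitoring,
    #    analytics, or the next round of training data.
    record = {"ts": time.time(), "features": [float(v) for v in features], "prediction": prediction}
    with open(log_path, "a") as log:
        log.write(json.dumps(record) + "\n")
    return prediction

print(predict(X[0]))
```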
So one of the things we have also been seeing, looking at this as an architectural blueprint, is the question of, well, okay, what frameworks can I pick and choose there? And what we're seeing is that there has been a convergence into the concept of a canonical stack, right? It's basically having elements or sections, little placeholders, that serve particular outcomes,
right? Like your experiment tracking, your data versioning, your experimentation, your model registry, your monitoring, right? And as part of that, you have an ability to choose different frameworks. So now what we start thinking about is how do we then encapsulate those components and think about, in a way, standards that we expect frameworks to
be able to have, so that we don't end up in a world where everything is completely different and we end up with different standards that create more standards, right? And we're going to touch upon that. This actually is a very interesting tool that you can use to just pick and choose your frameworks. But now there's actually starting to be a little bit more convergence, which is interesting to see; there's starting to be a bit more preference for certain tools and
certain combinations of tools also depending on the scale of the projects. Another trend that we're seeing is that people are starting to also realize or, well, I mean, realize and also put a name to it that when we talk about production machine learning, we no longer talk about the production model, right? We actually talk about a production system. And what that basically means is that we stop thinking about this model-centric
machine learning and we start thinking about this data-centric machine learning, right? Is the question of how does your data flow? What are the transformations of the data as they go through your system? And of course, you know, here is an example of a architecture of the Facebook search. So I actually cut the, yeah, here's the diagram.
So here you can see that there's actually an offline and an online sort of section. So basically training the embeddings and then being able to use them. And you see that there's multiple stages, right? As part of this, there's not just going to be multiple machine learning models, but there's going to be multiple versions of those models, multiple relations
to the training data, and multiple components that are not machine learning related, right? So when we think about this machine learning system, it's important to understand what that means in terms of intuition. Because when you look at something like this Facebook search, maybe it's a little bit abstract, so this is where we can go back to what I was suggesting about LLMs providing a more intuitive picture. And I think right now we are starting to see these agent chain architectures, where people are thinking of deploying an LLM that then interacts with, let's say, another API, or that then interacts with a database. That in itself is, in a way, a data-centric machine learning system where you are expecting multiple flows of interaction, right? And I mean, there is, of course,
you know, increasing complexity depending on how large the system is. But the way to think about it is that each of these components will also introduce the challenges that we, you know, reviewed at the beginning and will benefit from the production machine learning considerations that we will talk about, right? All of this monitoring, metadata management,
every single component is something that you'll have to consider as part of your machine learning system. But yeah, so this is just another topic that, you know, you can spend an entire weekend, I mean, people spend their entire PhDs and careers just on that. But, all of these areas are definitely very interesting to dive into.
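As a toy illustration of that data-centric, agent-chain idea, the sketch below wires a placeholder model call to a database lookup and an external API call; call_llm and fetch_weather are hypothetical stubs, not a specific product's API, and each hop is a component that needs its own monitoring and lineage.

```python
import sqlite3

def call_llm(prompt: str) -> str:
    # Placeholder for a hosted or self-hosted language model call.
    return f"[model output for: {prompt[:60]}...]"

def fetch_weather(city: str) -> str:
    # Placeholder for an external API the system depends on.
    return f"18C and cloudy in {city}"

def answer(question: str, user_id: int, db: sqlite3.Connection) -> str:
    # 1) Retrieve user context from a database: another component with its own failure modes.
    row = db.execute("SELECT city FROM users WHERE id = ?", (user_id,)).fetchone()
    city = row[0] if row else "unknown"
    # 2) Enrich the prompt with live data from an API.
    context = fetch_weather(city)
    # 3) Only now call the model; the prediction is the product of the whole data flow.
    return call_llm(f"User in {city}. Weather: {context}. Question: {question}")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER, city TEXT)")
db.execute("INSERT INTO users VALUES (1, 'Prague')")
print(answer("What should I wear today?", 1, db))
```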
So another thing to take into consideration, so as part of this machine learning systems, we also want to understand what are some of the relationships between the components, right? And we, I mean, probably most people here have actually come across the concept of a model registry, right? Like an ability to be able to, you know, keep track of your
trained machine learning models. But when it comes to production machine learning, we actually introduce a new paradigm that has to bring new, sort of like, new considerations, right? And let's actually see that intuitively. Let's say that we have a data set, so instances for data set A, so we have basically all of these instances
that we then use to train a model, right? So we run an experiment, we train model artifact A1, right? We train artifact all the way to AM, right? So we have basically a data set that we are using different parts to train basically different models within an experiment. But then we also may have other model artifacts that come from different data sets.
That in itself is your artifact store, right? But then what happens when you productionize your models, right? You productionize your models, let's say you productionize your artifact AM, and you are productionizing it with certain configuration, right? Then you may actually productionize it again with, in another environment,
with a different configuration, right? And then you may actually productionize a combination of these models as a pipeline or as a data flow component, right? So, you know, this introduces considerations that your traditional artifact stores are not fully capturing. And as part of that, you do have to, you know, make sure that those things are considered when it comes to your production environment, right? Because if something goes wrong with model AM, then something will happen in your pipeline, something will have to be debugged, you'll have to consider that second model, and you will have to understand the whole picture and bring in the relevant experts: you know, who trained model B1,
right? Who trained model AM? But yeah, ultimately this is just for an intuition, so hopefully it doesn't confuse you a bit more for the next section. And this next section is saying basically, okay, so we have multiple models in production that have sort of multiple considerations in this sort of machine learning system.
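A hedged sketch of the extra bookkeeping this implies, beyond a plain artifact store, might look like the records below; the class and field names are illustrative, not from any particular registry.

```python
from dataclasses import dataclass, field

@dataclass
class ModelArtifact:
    artifact_id: str        # e.g. "A1" .. "AM"
    dataset_id: str         # which dataset (and slice) it was trained on
    experiment_run: str

@dataclass
class Deployment:
    deployment_id: str
    artifact_id: str
    environment: str        # e.g. "staging", "prod-eu"
    config: dict = field(default_factory=dict)

@dataclass
class Pipeline:
    pipeline_id: str
    deployment_ids: list    # ordered components of the data flow

def blast_radius(artifact_id, deployments, pipelines):
    """If an artifact misbehaves, which deployments and pipelines are affected?"""
    affected = [d.deployment_id for d in deployments if d.artifact_id == artifact_id]
    touched = [p.pipeline_id for p in pipelines if set(p.deployment_ids) & set(affected)]
    return affected, touched

# Hypothetical usage: the same artifact AM deployed twice with different configurations.
d1 = Deployment("d-1", "AM", "prod-eu", {"replicas": 2})
d2 = Deployment("d-2", "AM", "staging", {"replicas": 1})
print(blast_radius("AM", [d1, d2], [Pipeline("p-1", ["d-1", "d-other"])]))
```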
Now, as part of each of those components, you will have also to introduce the best practices around how do you keep track of them, right? It's how do you know when something goes wrong? And this is basically by introducing things that, you know, in traditional software would be just monitoring, right? And traditional software monitoring would be things like,
what are the requests per second? What is the throughput of your service? What is the current CPU usage? What is the GPU, well, RAM usage, right? And suddenly it crashes, you start looking at the chart, and then you see that there's a consistent climb, and then you realize that there's a memory leak, and then, you know, you go and address it, right? But in machine learning, there are further considerations when it comes to monitoring
for each of these components, right? It's monitoring of things like statistical model performance, right? Like what is the accuracy, or the precision, of your model in production? And in order to answer those questions, there's an element of data labeling, right? You have to know what the actuals are in your production environment.
So that introduces basically the questions of how do you then, you know, bring that into your production environment and monitor that. There's also things like explainability, right? How do you make sure that whenever there's a prediction, you can explain what happened as part of that prediction? That in itself is another, you know, area of, you know,
research that has some really interesting approaches. And then also the question that we were discussing: as part of your inference data, you may also want to get some insights, right? What are the distributions of your production data? What are, perhaps, use cases that you can bring into the organization from your inference data? So those things are considerations that go beyond the traditional monitoring of software.
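Here is a minimal sketch of that statistical-performance side of monitoring under the usual constraint that labels arrive late; the in-memory dictionaries stand in for whatever prediction log, label store, and metrics backend your stack actually uses.

```python
predictions = {}   # request_id -> predicted label, written at inference time
actuals = {}       # request_id -> true label, written whenever labels become available

def log_prediction(request_id: str, predicted: int) -> None:
    predictions[request_id] = predicted

def log_actual(request_id: str, label: int) -> None:
    actuals[request_id] = label

def production_accuracy() -> float:
    """Join predictions with the actuals that have arrived so far and compute accuracy."""
    joined = [(predictions[r], actuals[r]) for r in predictions.keys() & actuals.keys()]
    if not joined:
        return float("nan")
    return sum(p == a for p, a in joined) / len(joined)

log_prediction("req-1", 1)
log_actual("req-1", 1)
log_prediction("req-2", 0)
log_actual("req-2", 1)
print(production_accuracy())  # 0.5 -> this is what feeds dashboards, SLOs, and alerts
```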
And then similarly, how do you introduce observability on top of that, right? So things like SLOs, so service level objectives, alerts, SLIs for indicators, so that you don't have to just be checking on the model in the dashboard, but you can have, you know,
pagers for your teams so that they can be notified whenever actually there's a problem. So, and then finally things like, you know, drift detection, outlier detection that you can introduce as part of your stack. But again, so, you know, each of these areas is probably like, you know, a deep dive in itself that you can actually check out in one of the
talks that we gave last year on production machine learning monitoring, which is interesting itself. Now, another consideration to take into account is the challenge of security, right? So this is something that comes up a lot when it comes to traditional software, but in machine learning, it's not something that is discussed as much. So when you think
about security, the first question is where is security relevant, right? Like what part of the machine learning life cycle do I have to think about security? Is it on the data processing? Is it in the model training? Is it in the model deployment? Or is it on the monitoring, right? And the reality is that it's basically like across all, right? I mean, well,
that's supposed to be like a red line across all, but you can see it well. But yeah, basically saying like every single part and every single stage of your machine learning life cycle is susceptible to vulnerabilities, right? And it's something that now the community is starting to think and ask the question of, well, what vulnerabilities? What does that mean,
right? So as part of that, we actually have started doing some initiatives exploring the security risks of machine learning. And actually, if you do want to get involved, we are running a committee as part of the Linux Foundation on machine learning security,
where we are really trying to explore what some of the challenges within security are. Examples of this would be challenges and risks on the model artifacts, right? Basically, potential injection within the binaries, and I think there's actually going to be a talk on that, on poison pickles I think it is, so do check that out.
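As a small and deliberately harmless illustration of why those pickled model artifacts are an attack surface, the snippet below only prints a message, but the same mechanism could run any code with your service's permissions:

```python
import pickle

class PoisonedArtifact:
    def __reduce__(self):
        # Whatever callable is returned here gets executed during pickle.loads().
        return (print, ("arbitrary code just ran while this 'model' was being loaded",))

malicious_bytes = pickle.dumps(PoisonedArtifact())
pickle.loads(malicious_bytes)  # loading alone triggers execution, no predict() call needed

# Hedged mitigations: only load artifacts from trusted and signed sources, scan them,
# and prefer weight formats that do not embed executable code where possible.
```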
There are also challenges of access to the model, being able to reverse engineer the models and exploit them, challenges of supply chain attacks, being able to inject into the dependencies, and challenges within the infrastructure that the model runs on. But ultimately, of course, this is a long-running list. The main thing to think about in this specific area is that as machine learning practitioners,
there is an element of security that, to a certain extent, you don't have to 100% care about all the time and try to address every single time, just because we can't make every single data scientist a cybersecurity expert as well. But ultimately, it is about thinking about it in a holistic sense, similar to how it's introduced in the software development lifecycle,
right? So how you have traditional SDLCs that we will see in a bit. So yeah, so that's basically some trends when it comes to the security side. Now, as part of the last area, let's talk about teams and organizations. So how does this look like when it comes to my team,
to my organization? How can I roll this out if I were to bring this to my area, right? The first thing is that we are starting to see a trend around this concept of the SDLC, the software development lifecycle, which organizations tend to adopt and roll out. Nowadays it's not very common to find organizations that ship things to production without an operations team, like a DevOps team, with CI/CD, with testing, et cetera, right? But when it comes to machine learning and this concept of the machine learning development lifecycle, it's a little bit different, because you cannot just roll out something that is the same across every single use case, right? Because different
use cases will also have a different level of risk. And different use cases will also have a different tech stack, right? Some machine learning teams may be much more analytics heavy, with analysts that are using analyst stacks, something that is perhaps higher level, versus something that is a little bit more machine learning engineering, that involves productionization, that involves real-time inference. So there are different considerations at the tech stack level, but also at the domain level in terms of risk, right? Some may actually involve heavy compliance
requirements, others may not, right? So when it comes to this MLDLC, it becomes more of a framework for adopting the best practices that are relevant for the context. Now, as part of that, you know, we talked a little bit about the components, but something that we're also seeing as part of trends in organizations is the concept of metadata itself, right?
Now we have different components, different systems, different frameworks, some that are doing our model versioning, some that are doing our model artifacts, some that are doing our model serving. How do we make sure that we have lineage across all of this? So that you know that if something went wrong in production, even at a compliance level, you can actually go back
into the training side to understand, you know, what the linkage between both is. And this is easier said than done, ultimately, because when it comes to metadata, it often involves standards to ensure that it's homogeneous enough that it can be processed and handed over, right? So what we're starting to
see is that there's standards that are trying to standardize standards, right? So what we want is to also make a bit of a call to action to not do that, right? And try to contribute to potentially existing ones and try to, even as open source project leads,
if they are present in this room, to also think about how can I standardize and bring alignment into this broader ecosystem. But this actually in itself is an interesting one. And again, there's a talk that was actually two years ago on meta ops. So I mean, it sounds
more boring than it actually is. It's actually pretty interesting, or I'd like to think that, but yeah, do check it out if you're interested. And then finally, the last couple of things to mention is that, you know, people in the data space would have heard of this, you know, buzzword of data ops and data meshes. So we're now starting to see a bit of a convergence
between this MLOps space and the data mesh concept, which is basically thinking about even MLOps, not as a single sort of centralized data lake where everything just gets put there, but as something that actually interoperates and is closer to the domain. And that serves kind of multiple different domain specific expertise, but also has an ability to like
interoperate on that. So that in itself is another interesting area that I definitely recommend diving deeper into, this intersection of data mesh and MLOps. Now, when it comes to the products themselves, we also need to start having a mind shift towards machine learning initiatives as products, right? Not just as projects. So ultimately they have roadmaps that, you know, involve that sort of feature improvement, incremental improvement perspective that has been ubiquitously adopted in the software space, right? It's really about seeing these machine learning initiatives as products that also involve that product
mindset when it comes to that, like refining, delivering value, iterating. And that similar thing is also reflected into the mindset that we're seeing within organizations is thinking about their machine learning also as this product roadmaps, right? So you have the investment at the infrastructure level, the investment at the tooling,
and then the investment at the actual value delivery for businesses. But how does this map into each other and how does this actually, you know, become kind of this product mindset? How do you iterate from those things? And then finally, this is the same thing for teams. We're starting to see this concept of squads coming into the machine learning space, this cross-functional, you know, feature-driven iterative combinations of researchers,
but also engineers that are delivering value as they would with a product. And similarly, starting to see the rise of this machine learning product managers and program managers that, you know, organizations are starting to really standardize towards.
Now, a couple of final things from an organizational perspective is that also you have to take into consideration that when it comes to all of these elements that we are talking about, this will come as your complexity increases, right? So it's not a big bang that you should just start with all of this complex infrastructure and bring in the full wrath of
Kubernetes and, you know, bring in everything scalable for a billion users from day zero. Instead, it's actually about thinking about this in an iterative way, right? When you start with just a few models, the common thing to see is a combination of data scientists and data analysts. As more models start coming in, we start seeing
that machine learning engineers start coming into the picture because otherwise, you know, data scientists just end up with a lot of the operational burden. And then as there is an increase of that, there is this sort of more specialized machine learning platform engineering roles. And then you start seeing also that increase of those elements of automation,
standardization, control, security, observability. And then similarly, the way that we think about it in this sort of, like, product mindset, when you have basically the group of data scientists, you may have them focused on a particular use case. But this product mindset is to think about how you can enable those data scientists to be able
to deliver value in a way that increases without having to also increase the headcount, right? Like, not requiring that linear growth in terms of the number of people with the number of models or the number of, the amount of value that you're delivering. And as part of that, that is when you start seeing that automation, right? Like, pipelines are introduced so that the science experts are able to start, you know, creating value
and repeatable value. And then, as you keep increasing this sort of product perspective, you start being able to raise the level at which it's operating, right?
So then it's actually data analysts or even business stakeholders that are interacting with this data and machine learning products to carry out the outcomes. So yeah, so that's ultimately, yeah, the main areas that I want to highlight. Just to wrap up, one thing to remember, and this is very important, right, is that from all of the
problems in the world, right, there's a very, very small chunk that actually are solvable, and should be solved, with machine learning, right? When you're holding a hammer, everything looks like a nail, right? So we have to remember that the first question is whether machine learning is actually relevant. Most often, and statistically, the answer is no. So yeah, that's just something to keep in mind. And also the last thing is that we have
to remember that as practitioners, you know, we do have a big impact and we have a lot of potential to drive value and change within this space, right, because a large and growing amount of critical infrastructure is depending on machine learning systems that we are,
you know, developing, that we are maintaining. And always the impact is going to be human, irrespective of the number of abstractions and data products and machine learning products. So that's always something to remember. So, yes, with that said, it was a lot of content, but I'm glad that everybody's still awake, I think. So thank you very much,
and I'll take some questions now and maybe more in the pub later. Thank you. Amazing. Thanks so much, Alejandro. That was riveting, honestly. So there's definitely a question. Is there a roaming mic? But there's also a stationary mic.
Hi. What do you think is the optimal team size in the future when you are running machine learning products in a company? Is it more like a hierarchical structure or a small independent teams? No, that is a great question. I mean, I don't think there's a silver bullet number,
but what we have started to see is that there is more of a ratio: not just the number of data scientists to the number of machine learning engineers to the number of machine learning operations or platform engineers, but also relative to the
outcomes that these are providing. So it's similar to when it comes to the questions of how many software engineers or how many engineers would you want in an organization. I think it also becomes a bit more of a ratio that ultimately is to the infrastructure and to the overhead. The trend that we tend to see is that machine learning engineers are included
once the data scientists are just getting overwhelmed on just doing operations and engineering and they are no longer actually doing data science. So yeah, I would say it's more of a ratio from what we see than a specific number. Thank you for the great talk. My question
is regarding the new generative AI that's emerging, and all the services, you know, AI as a service, that are appearing, and the high quality of what we see. Do you think that's a threat for AI researchers and data scientists, in the sense that regular data scientists cannot build something like GPT? So just to make sure that I got your question: is your question that you're seeing a lot of innovations in large language models or
just like generative AI and is your question about researchers not being able to compete with that like amount of hardware that is happening or is with the use cases that are coming out? With the use cases, I mean in the past when you need a new product, you hire some researchers, they build it and now you just get GPT.
Right, okay, okay. Well, I mean, so I'm a bit critical of all of this like generative AI hype because I mean there is certainly like an element of huge potential and huge value and there definitely will be a lot of transformative elements around it. But something
that I have been seeing becoming a little bit clearer and more prevalent is that what this is converging towards is not complete automation, where researchers, or engineers, or even full stack engineers that are creating products, simply cease to exist, but rather that those domains will evolve
and the practice will be acting at a significantly higher level, right? So they will become experts in building something along those lines and we're already seeing that right in those
sort of complex agent chain architectures that I was talking about, which start delving into the concept of this data flow machine learning, where I would say the more boring and traditional best practices of MLOps become even more clearly critical, right?
And that means that the jobs for machine learning engineers, machine learning operations, platform engineers, I mean, they're going to be there and also like for data scientists, I think. But yeah, it is exciting. There are, you know, still quite a while to go to really get it right, I mean, from what I see, but yeah, that's my perspective that it's a little bit
more of kind of like an evolution of profession, yeah. Thank you for the presentation. So can you recommend, for example, three frameworks that help with running a machine learning pipeline
in production? You know, top three, your favorite. My top three. So I mean, that would be a hard one. What I would point you to is to have a look at the two things that I showed. The first one is the canonical stack because that will actually give you a lot of guidance
for each of the frameworks that are recommended and that are very popular. And I will also recommend you to check out basically the list of MLOps frameworks. Yeah, I don't think, I mean, yeah, it's how long is a piece of string? I mean, what I would recommend actually, there was
this article that we shared in our newsletter just last week, which was called MLOps at reasonable scale. And they actually had a practical end-to-end MLOps pipeline that was balancing effort against scale and scalability. So yeah, I mean, that would be a good one to
get started with. So yeah, sorry for not giving like a very specific one, but you know, this should give you enough for you to find the ones that are right for just playing around or actually bringing to your environment. Yeah. Thanks. Okay, we've got two minutes. So
can you do one minute on each question? Hi, so I'm wondering, since you were talking about experimenting with models in production, what do you think about best practices there? Is there
any good framework to follow? So with experimentation, what we see is mainly to have basically those considerations of experiment tracking, artifact management, lineage of metadata. So I think, you know, really kind of following those
suggestions of the tools of the canonical stacks gives you an intuition of what are the best practices that need to be enforced within them. Because for example, the lack of having that experimentation management tool or model tracking tool would mean that yeah, you just don't have that best practice. So yeah, I would say probably along those lines, yeah.
And then the final question. Hi, thank you. So I feel like in recent times there's this move from just data scientists to machine learning engineers doing things end to end. So I'm just wondering, what do you think about that? Will we move towards a machine learning engineer who is able to do data gathering, machine learning, developing the model, and then implementing it? Or will you see all the different positions still existing? Yeah, I mean, so like with a full stack engineer in traditional software engineering
that does front and back end, I think in earlier stage products and even earlier stage startups, you may find individuals that may actually master both and be unicorns and be able to do all of them. But what we are starting to see is more consolidation towards a specialized skill set. So still with data scientists, of course, getting a little bit more of that engineering acumen,
that is definitely a trend. But ultimately, the machine learning engineers being the ones that focus more on productionization, but still with the knowledge around the machine learning specialization. So not being as much of an expert as a data scientist is for data science, but same the other way around. So yeah, I mean, analogous to a full stack engineer
in software development, yeah. Okay, I mean, we can, if anyone has a really urgent question, or you can maybe catch Alejandro in the break between the keynote, which is, by the way, on LLMs, so nice segue. Okay, well, let's give Alejandro a warm
round of applause. Thank you.