Deploy your Machine Learning Bots like a boss with CI/CD
Formal Metadata
Number of Parts | 130
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/49954 (DOI)
EuroPython 2020 (49 / 130)
Transcript: English (auto-generated)
00:06
So next is William Arias, and he's going to talk about deploying machine learning bots like a boss using CI/CD. William Arias, you work for GitLab, is that correct? Hello. Yes, I work at GitLab. Yeah, perfect.
00:23
So you're the expert. So, where are you joining from, by the way? I am from Colombia, but I live in Prague, in the Czech Republic. All right, so it's a good time zone for you now. Okay, excellent. So please start sharing your screen, and then you can take it away.
00:45
Okay. Hello everyone. My name is William, and today I want to share with you some of the learnings I have had since I started, back in 2016, developing bots. I'm a big fan of NLP and of the use cases of a chatbot, and
01:05
in this talk I want to introduce you to how you can step up your game using principles of continuous integration and continuous delivery, using Rasa, the framework for chatbots, and GitLab CI/CD.
01:21
So why did I decide that I wanted to share these learnings with you? Because when you start working on this, as a hobbyist or as a professional (I have done both), usually what you do is start behaving like a one-person band. I was reviewing the agenda of the talks today, and
01:44
if you have been following as well, you can see that there are presentations teaching you how to create a Dockerfile, how to deploy a model using Docker, how to serve it. So you have to create an API, and besides that you have to use other tools to create the model itself, and at the end we start behaving more or less like this.
02:04
I don't know if you like The Simpsons; when I was doing that I remembered this guy, because we are a one-person band, we are playing all the instruments. So when you are expected to deliver an application that has machine learning embedded, you more or less have to behave like a one-person band and know a lot of
02:26
instruments, and to learn how to play these instruments there is a learning curve that can be frustrating. The other day I was a teaching assistant in an online course where the students
02:43
were able to create the model for a chatbot, but when they were asked to please deploy it to AWS and give us an endpoint where we can connect a channel and use it, many of them started struggling, because that's not what you are necessarily supposed to
03:01
be aware of. You know the models, you know how to use different libraries to classify intents and to classify entities, but not necessarily all the tooling that you are expected to have some knowledge of to be able to serve a model. So,
03:21
for example: let's say you use TensorFlow to solve some classification problem. You prototype in a Jupyter notebook, and once you have prototyped in the Jupyter notebook, if you want to store data you will probably need a database. So you need to know which one to use: should I use something like Cassandra,
03:43
for example, or Postgres, or MySQL? And the same as before: if you start digging into this, for sure you will find, okay, I should deploy it as a container, and then Kubernetes, and how do I deploy? Should I create a virtual machine? Or should I go directly to platform as a service, or should I just use an API, software as a service, whatever?
04:03
And next, serve it to the outer world. So all of these things are more or less what they call today the full-stack machine learning engineer or data scientist, I don't know. And I want to share with you the approach that I followed, and how today, if the challenge that you are facing is: I
04:24
want to create, I don't know, my own startup in which I will build chatbots for whatever business, and you want to be fast and you want to deploy in a way that is very close to a standard professional way, but also
04:42
with iteration in mind and being fast. So I want to show you how I did it and how I am doing it. This is not written in stone. This can change, and with all these workflows, if you know the principles then you can adapt them to your own workflow. So this is how I started to
05:00
divide this problem, when I said: it's taking me too much to deploy a chatbot to production. It works on my machine, the typical one: it works on my machine, it's very easy to run it on localhost, but I want to deploy it. And whether you are doing this for yourself as a one-person band or for a company, you will find these challenges.
05:21
So first I started thinking of it as what they are calling these days machine learning operations; this could be a workflow of that. This talk is not necessarily about MLOps, but it borrows some concepts from there. So first, you are aware that the model that you are creating has some life cycle in which you have to go through all these steps, and these are the ones that
05:45
we are expected to be knowledgeable in: how can I ingest the data, how can I version it or label it, run the validation tests, train the model, fine-tune it, do hyperparameter analysis, read all these learning curves, and then deploy and then get feedback.
06:02
So this is only the model. But one of the things that I learned is very important when you want to scale these solutions and when you are not only developing for yourself. This even helps if it's for yourself, because when you move to the process view of what you are attempting to do, the first person that will be grateful
06:21
three weeks later or a month later is yourself. You say, oh, it's so good that I put it in this view, because now you understand all the steps and more or less how you can split the responsibilities. And I put here that one of them is a dialogue contributor, because one of the things I have found while developing
06:43
chatbots with the Rasa library or others is that you might be the, let's say, expert (I don't like that word), but you might be the most knowledgeable person in how to build the bot, but you are not necessarily the subject matter expert on the topic for which the bot is supposed to provide a service.
07:03
And in many cases the subject matter experts are not developers, or they don't have any computer science or engineering background, but they are the ones who know what the training data that you feed to the bot should be. They know what type of questions the end users will ask. So in this way you also
07:25
can start enabling other types of profiles: how can I incorporate their contributions into my process to create a better bot? So this is the process view, and it contains many of the steps from the model life cycle view.
07:41
And what are the benefits of this? When you do this, first it is a favor that you do to yourself, but you also have a common understanding of this process for everyone. So if next time there is an iteration in which something should change, now it's very easy for, let's say, some new employee, or for you one month later, to pick it up again and say, okay,
08:01
this is what I was doing, here is where I am. And when you use this process modeling as well, you can identify bottlenecks or where the things that you are doing might be failing. And this also gives you some standard that will enable iteration, so you can improve the performance of the chatbot with the passing of time.
08:20
And now, how does this look? So, the components view. This is the high-level diagram of what I have architected, and it works to a very decent level: this could be the pilot or prototype, but a very decent one, not PoC or hackathon level, a very decent pilot. We will go
08:45
step by step, and I will show you, starting from this part. So what is in this part on the left, the dialogues or Python code? This is your development environment. This is where you have your PyCharm or Visual Studio or Sublime Text, whatever. Here is where you have
09:01
your code. Someone can be adding dialogues in a certain format for your bot. So the only thing you do, once you invest time learning principles of CI/CD, is that you work on your machine, and once you are happy with your changes or with whatever you did in your local environment, you do git commit and git push to remote.
09:26
So what I am doing in my workflow is that I make a change, or in this case, for this example, I am adding more dialogues to the bot, so the bot knows more combinations and is able to have better conversations.
09:41
This push starts everything, and from here everything is automatic. Back in the day, let's say four years ago when I started, before using this, you had to use the command line and know each one of the commands that, for example, the Rasa framework provides: rasa train,
10:01
rasa cross validate or rasa test data, all of these things. So what is happening here is that once I commit the code, GitLab CI will automatically start some runners, or containers (it's called a runner), and each one of these
10:21
will run the steps that you usually would run manually. This is nothing new for someone who has been working in traditional development for years, but this is more tailored for people who maybe come from a math background, or who are breaking into machine learning and would like to build bots.
10:41
This is not necessarily knowledge that they have. I, for example, didn't have these good practices of software delivery four years ago when I started. So now that I have discovered them, I said, oh, this makes my life easier, I want to share it with you. So these steps that are usually executed manually start running on different machines.
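As a rough illustration of what "running the manual steps automatically" can mean, here is a minimal Python sketch that runs a list of stages in order and stops at the first failure, which is roughly what a CI pipeline does across its runners. The stage names, the choice of commands (taken from the Rasa 1.x command line) and the `run_stages` helper are assumptions made for this example; the talk does not show the actual pipeline definition.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# Hypothetical stage list: names and commands are assumptions for illustration.
STAGES: List[Tuple[str, List[str]]] = [
    ("validate", ["rasa", "data", "validate"]),
    ("train", ["rasa", "train"]),
    ("test", ["rasa", "test", "nlu", "--cross-validation"]),
]


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(command)).returncode


def run_stages(
    stages: Sequence[Tuple[str, List[str]]] = STAGES,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> List[str]:
    """Run the stages in order and stop at the first one that fails.

    Returns the names of the stages that passed.
    """
    passed: List[str] = []
    for name, command in stages:
        if runner(command) != 0:
            break  # later stages depend on this one, so do not continue
        passed.append(name)
    return passed
```

Passing the runner in as a parameter keeps the sketch testable without Rasa installed; in a real pipeline each stage would typically be a separate job rather than one script.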
11:02
So one runner is going to validate that the data that you pushed to the repository is okay, that the format matches the one that the bot requires, and that the data is complete. Then it will train the model and it will test the model, and by test the model here
11:20
I mean the concept of testing in machine learning: you can either create a test data set that you will run against your model, or you can do, for example, as in my demo, cross-validation with the data that you have available. So assuming that each one of these steps
11:41
is successful, then the model gets created. So you have an artifact now, and the artifact is your model. When you test the model you get some metrics of its performance, and if you are happy with them you say, okay, now I want to do a merge request and put my bot into the main branch, and this will
12:05
deploy to production or staging. And here, this part, the paper trail, is one of the key steps in machine learning operations that I am borrowing and bringing into this pipeline of
12:21
serving chatbots: whatever change you make in your dialogues or in your code, running these pipelines will create a paper trail. What does that mean? You will have a record. For example, this is a snapshot of today: the model that was created today,
12:44
what type of training data it used, what the code was at that moment. In other words, this is version control that goes into a record that I post in a wiki. And if someday you work in a company that gets an audit, and they say, show me why your model
13:00
was making these predictions, you can easily go to the history of commits and say: look, this model had this training data, these were the parameters, here is the code, and here is the data that we used to train. So this is the paper trail, and it is required in the enterprise. And also, if you are just playing, this paper trail helps you keep track of your experiments and have some sort of report in which you can
13:24
compare whether the changes you were making were improving or deteriorating the performance of your model. So once the model is ready, and all the steps and the testing, from the software engineering point of view and the machine learning one, are successful, and I'm happy with everything,
13:43
what I do here is use this framework. It provides a REST API, and you can deploy a model server wherever you want; in my case I'm using Google Cloud. And then you can use this
14:01
REST API to put the model there and make it active. And once the model is there, it can be available to be tested by real users, or by other people who can serve as testers of your bot, of your dialogues.
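A minimal sketch of what such a call might look like from Python, assuming the open source Rasa server's `PUT /model` endpoint with a `model_file` field in the JSON body. The talk does not show which endpoint or server product is used, so the base URL, the model path and both helper names are hypothetical, and a different server may expose a different route.

```python
import json
import urllib.request


def build_activate_request(base_url: str, model_path: str) -> urllib.request.Request:
    """Build (but do not send) a request asking the server to load a model.

    Assumption: the server accepts PUT /model with a JSON body containing
    "model_file", as the open source Rasa HTTP API does.
    """
    payload = json.dumps({"model_file": model_path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/model",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def activate_model(base_url: str, model_path: str, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    request = build_activate_request(base_url, model_path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Separating request construction from sending makes it possible to check the URL and payload in a test job without a running server.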
14:21
But it doesn't finish there. Up to this point I could be quite happy that the only thing I had to do was: let's update the knowledge base of my bot, let's just do git add, git commit, git push, and it goes automatically through all of these steps. I don't have to worry about
14:42
manually executing commands and checking performance metrics, because I can build logic so that if a metric is above a certain threshold, it's okay. But once you are there, and the model is served in this model server and is ready to take conversations, it's very important to close the loop with some feedback. The way that I've been doing this, and I will show it later, is that
15:04
I am using the same REST API that the Rasa framework provides to extract a record of the dialogues or conversations that the bot has been having with real users. So what you can do here is that when you expose it to the outer world and the bot is interacting with real
15:26
people, who use different ways to say hi or to express an intent, you can bring this back to your environment, into some sort of data playground, and here you can do what you like the most, that is: okay, let's use
15:41
pandas and Python libraries to analyze it, to do NLP in other words. And from there you can extract insights to later feed back into the bot, retrain it, and continue what I mentioned before: this loop that enables iteration to improve the bot's performance.
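To make the feedback step concrete, here is a small sketch of the kind of analysis that could be run on exported conversations. It uses only the standard library instead of pandas so that it is self-contained. It assumes each conversation is a tracker dictionary with an `events` list, where user events carry `text` and `parse_data.intent` (name and confidence), which is the shape returned by Rasa's conversation tracker endpoint. The function names and the 0.6 confidence cut-off are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def user_messages(tracker: Dict) -> List[Dict]:
    """Keep only the events in which a user said something."""
    return [e for e in tracker.get("events", []) if e.get("event") == "user"]


def summarize(trackers: Iterable[Dict], low_confidence: float = 0.6) -> Dict:
    """Count predicted intents and last words, and collect uncertain messages."""
    intents: Counter = Counter()
    last_words: Counter = Counter()
    uncertain: List[str] = []
    for tracker in trackers:
        for message in user_messages(tracker):
            text = (message.get("text") or "").strip()
            intent = message.get("parse_data", {}).get("intent", {})
            intents[intent.get("name", "unknown")] += 1
            # Messages the model was unsure about are candidates for new training data.
            if intent.get("confidence", 0.0) < low_confidence:
                uncertain.append(text)
            words = text.lower().split()
            if words:
                last_words[words[-1]] += 1
    return {"intents": intents, "last_words": last_words, "uncertain": uncertain}
```

The list of low-confidence messages is the part most directly useful for retraining, since those are the phrasings the bot has not learned yet.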
16:05
So more or less what I'm telling you is that you could be a one-person band, but you can also bring a conductor to your orchestra: someone, or something, in this case technology, that will help you orchestrate all of these manual steps. And at the end of the day, using
16:24
these principles of continuous integration and delivery will step up your game creating bots, and, very importantly, it will really make your life easier. I am a testimony to that. Before, as I was saying with the other
16:41
talks, sometimes I wanted to make some change, and the change was minimal, like adding just one more sentence or changing one. It happened to us building a bot for the topic that has us here online: when we were building a bot, about two months ago, for frequently asked questions about
17:05
coronavirus. One day the bot was supposed to answer: no, don't use a mask, because masks should be used only by health professionals. One day later: no, no, the bot should reply, yes, make your own mask and please use it.
17:22
So it was only one single change in one sentence of the bot, and it would imply that you have to maybe jump into the Docker image, make the change, many things. Here, the only thing you have to do is make the change in your PyCharm, in your local environment, then git commit, git push, and this will take care of that.
17:44
So, in this case I'm in PowerPoint, in Keynote, and everything works in Keynote, so I want to show you a video of how it could look. I will also share a link with you; this is public on YouTube.
18:01
So here, a typical developer workflow: I'm just creating a branch in which I want to work on my changes. Once I create the branch, I'm using the Rasa format to add intents or knowledge to the bot. I'm just adding two words.
18:21
Okay, I'm just adding two words here. Then git add, git commit, and here, after I commit the changes, I will push them. Here is where the magic starts, because I'm pushing to remote, and the remote, which is GitLab, is where I have configured all this
18:43
orchestration with my newly hired conductor for my orchestra. So, a typical, let's say, developer workflow: this is creating a merge request, and when I come here, this has started automatically once I did the push.
19:01
Automatically this started to run this pipeline. The steps ensure that when the bot is ready, it is really ready to be deployed to production. And what each one of these steps is doing is what I described in the diagram before: it is testing that the format or the template that we need to feed the training data to the bot is correct.
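As an illustration of a format check of this kind, the sketch below parses training data written in the Markdown layout used by Rasa 1.x (`## intent:<name>` headers followed by `- example` lines) and reports intents with too few examples. This is a simplified, hypothetical check written for this example; it is not Rasa's own validator, and the minimum of two examples per intent is an arbitrary assumption.

```python
from typing import Dict, List, Optional

INTENT_PREFIX = "## intent:"


def parse_intents(markdown: str) -> Dict[str, List[str]]:
    """Collect the example sentences listed under each intent header."""
    intents: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith(INTENT_PREFIX):
            current = line[len(INTENT_PREFIX):].strip()
            intents.setdefault(current, [])
        elif line.startswith("##"):
            current = None  # some other section type; ignore its items
        elif line.startswith("- ") and current is not None:
            intents[current].append(line[2:].strip())
    return intents


def find_problems(markdown: str, min_examples: int = 2) -> List[str]:
    """Return human-readable problems; an empty list means the data looks fine."""
    intents = parse_intents(markdown)
    problems: List[str] = []
    if not intents:
        problems.append("no intents found")
    for name, examples in intents.items():
        if not name:
            problems.append("intent with empty name")
        if len(examples) < min_examples:
            problems.append(
                f"intent '{name}' has {len(examples)} example(s), "
                f"expected at least {min_examples}"
            )
    return problems
```

A validation job could print these problems and exit with a non-zero code when the list is not empty, so that the later training stage never runs on malformed data.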
19:24
It's testing, it's creating the model, and once the model is created, it is testing the model. So, a typical developer workflow with good practices. What I'm showing here is that I'm going runner by runner, and these are the different machines that are executing
19:44
automatically the steps that, if I didn't have this, I would have to do manually on the command line. So by using these runners I am orchestrating all these stages automatically, following good development practices. And here, this is how my workflow or pipeline
20:03
looks. What I triggered with this git commit is, as I mentioned before, these stages. For this branch, they will end up writing a report, and the report is what will show me
20:21
the different artifacts that were created, like these ones: different artifacts that get created when I train the model, such as, I don't know, a histogram showing the prediction confidence distribution, or a JSON file that also shows me some sort of report of how the bot
20:40
performed. This JSON is pretty helpful, but it's not nice, let's say, for a business person who just wants to say: show me how it looks, you know, in a nicer way. So this step that is called write reports is writing, in the same GitLab, in the wiki, the report of the experiment. You can auto-generate this so that you can later revisit it, or
21:07
keep some track of the changes of your experiments or of the model that is in production. So once I'm happy with everything, I can merge the changes to my bot. And here, when I start merging, this is a view of the Rasa
21:23
model server, in which, as you can see, at the moment when I took this video the model that was active was a model from last Sunday, but this was Wednesday. I'm running a pipeline that is merging the changes into the main branch, and once this pipeline finishes,
21:44
what it will show us here... it's about to finish the deploy stage. Right there, what is going on is that my pipeline is talking to the model server, to the Rasa server, and it's saying: I am ready, the model went through all of the steps, please deploy.
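Earlier the talk mentions building logic so that a model above a certain threshold counts as okay. A deploy stage could apply such a gate before asking the server to activate the model. The sketch below shows one possible shape for it; the metric names, the threshold values and the function names are all assumptions, since the talk does not show the actual logic.

```python
from typing import Dict, List

# Hypothetical minimum values; real thresholds depend on the bot and its data.
THRESHOLDS: Dict[str, float] = {"accuracy": 0.80, "f1": 0.75}


def failed_metrics(
    metrics: Dict[str, float], thresholds: Dict[str, float] = THRESHOLDS
) -> List[str]:
    """Names of the metrics that are missing or below their threshold."""
    return [
        name
        for name, minimum in thresholds.items()
        if metrics.get(name, 0.0) < minimum
    ]


def ready_to_deploy(
    metrics: Dict[str, float], thresholds: Dict[str, float] = THRESHOLDS
) -> bool:
    """True only when every required metric reaches its threshold."""
    return not failed_metrics(metrics, thresholds)
```

Treating a missing metric as a failure is a deliberate choice in this sketch: if the test stage did not produce a number, the safer default is not to deploy.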
22:05
And once this is successful, as we can see there, I will refresh this page and you can see that now the model is ready: it was automatically deployed and made active. And from there I can use tools like the ones that Rasa offers to create a link to share this newly created bot
22:23
with external people so they can test it, and I can also leverage an approach of conversation-driven development, because here, before deploying to production by following the steps that I just followed, you can have more feedback on how the bot is performing. So, if you appreciate the beauty of this: you were working in Python, you made two or three changes,
22:45
you did git commit, git push, and all of these things started automatically. And to close the loop, I want to show you the last part, which is that in this case I am using the API
23:00
from Rasa, and I'm bringing... okay, we have five minutes. I'm using the API from Rasa to bring in some of the dialogues or conversations that the bot had. This bot was newly created only for this event, so it doesn't have too much data, but this can give you an idea that here the sky is the limit. You can say,
23:22
okay, I want to filter by number of intents, by the highest confidence of the best intent, or by what type of tokens people are using the most. Are they saying hello the way I thought, or are they saying, hello, I want this? Are they using two or more intents in one sentence? And for example, in this one
23:41
I was just extracting the last word, and this, for example, could give me a glimpse: look, if people are saying I'm a bit mad, or I'm quite furious, the bot is probably not working the way that I expect. So, to finish: I showed you how it looks, and if you adopt these continuous integration and delivery practices,
24:04
it will allow you to have an orchestra for yourself, with a conductor that will help make your life easier. If you scale this up to a company, this will increase communication and collaboration, you can have a more reliable solution, and at the end of the day it's setting you up to grow more from there.
24:21
You can grow from manual development to the pipelines that I just showed you, and to another point where you can think of fully automated processes. For example, instead of doing what I was showing you, calling the Rasa API by hand to bring in conversations, I could automate this job to run every day at 8 pm and run some preliminary scripts.
24:40
And yeah, so this is the next step. If you want to know more, I have some links here, and I will make available this starter project that you can fork. From there you can see how I orchestrated all of this using GitLab and Rasa. From this project you can just copy, change it to point to your own server, and it will be ready to
25:02
replicate what I just showed you in this demo. So thank you. I think I have three minutes for questions. Thank you very much, William. That was a very nice talk, very interesting indeed. So we have one question. The
25:20
question is: what are the cons and pros of GitLab CI/CD versus having a Jenkins server? Okay, well, if you already know Jenkins... so what are the pros and cons? Cons: the learning curve. If, like me, you came to CI/CD without knowing Jenkins, GitLab is very easy to learn.
25:42
So that's one thing, the learning curve. And one pro is that in GitLab you have everything in a single app. You don't have to switch environments. For example, here I have it in one single view: I have the pipelines here, I can define everything, and what I define in my project as my pipeline is automatically read and triggered by GitLab here,
26:07
for example. So, pros and cons: it's a pro if you are comfortable with learning this tool, because it's very easy and everything is pre-integrated, and as I said, it will make your life easier, so you don't have to think:
26:22
okay, I need to learn a new tool, I need to learn Jenkins plus Rasa plus my own development. The reason I like this a lot is that everything is here, under this single application, and whatever I push to this repository will trigger a pipeline in my
26:42
GitLab CI. So that's something I can tell you from the point of view of a GitLab fan, but also from a personal point of view. If you already know Jenkins and its learning curve, okay, go for the things you already know, because at the end of the day you just want to make your life easier and deliver faster.
27:01
Okay, thank you for that answer. There's a second question, I don't know whether we have time, but: is there a difference between GitLab CI/CD and GitHub Actions? Well, yes, they are different products. Yes, there are differences. We can take it offline,
27:21
because this is moving me into the conversation about competition. Right, okay. I can, in the room... sorry, in the room, because all of this is public for GitLab. We have a competition page: if you google GitLab versus GitHub Actions, this is public and you will find it there.
27:42
It's not that it's not possible for me to talk about it. No, you can do it, and it's so transparent that you can find it if you search for it yourself, but it's quite a long topic. So, okay, let's take that offline into the talk chat. We posted the link in the talk chat in Discord, so you can find the talk room for this. So,
28:01
thank you very much, Arias, for the very nice talk. Let me give you your applause. Thank you. Thank you.