MLtraq: Track your ML/AI experiments at hyperspeed
Formal Metadata

Title: MLtraq: Track your ML/AI experiments at hyperspeed
Title of Series: EuroPython 2024
Number of Parts: 131
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: you are free to use, adapt, copy, distribute and transmit the work in adapted or unchanged form for any legal and non-commercial purpose, as long as the work is attributed to the author and shared, also in adapted form, under the same conditions.
DOI: 10.5446/69477
EuroPython 2024, talk 57 of 131
Transcript: English (auto-generated)
00:04
Thank you. Thank you for coming to my session, and bear with me after lunch. Let's get started. First of all, why are we tracking? Why do we have an interest in tracking experiments? Well,
00:27
we want to explore and understand the impact of changing algorithms, datasets, or parameters. We enter a loop: we have a hypothesis about what could work and potentially improve performance, then we design our model,
00:44
we train our model, we tune our parameters, and then we evaluate what we built. Then we might decide to keep repeating the process; this is basically the experimentation process.
01:01
Along this journey you keep tracking, to maintain a feedback loop of what's happening and to see whether you are improving. And what kind of information might we be interested in tracking? Here we see a big cloud of things: parameters, inputs, predictions, metrics,
01:26
metadata about who trained and evaluated the model, debugging information, code, configurations, environment variables: anything that can be associated with the performance of our models
01:41
or that can impact our experiments. So the topic of this talk is why it is important to experiment in an efficient way, why it is so important to be able to experiment fast. Well, once you are able to
02:02
track really fast, you don't care that much anymore about performance. You stop selecting what to track and what not to track; you relax and start tracking more, which means
02:23
you potentially avoid repeating past computation, because you might already have tracked what you were interested in and don't need to run things again. This also means that you can iterate faster and experiment more. So at the end of the day, if tracking becomes really fast, you just tend to track more and you are more efficient, and the
02:45
experience for the developer or researcher also improves. Let's now have a look at how we can model an experiment. What is an experiment? We will use this framework,
03:03
this way of representing and thinking about experiments, through the rest of the talk. We can model an experiment as a collection of runs, where each run is an instantiation of our experiment with certain inputs and parameters.
03:20
Here on the bottom we see, for example, a train-and-evaluate procedure where, given different inputs, we get different outputs; this would be represented by two runs in this model. And here we see an example of an experiment. Let's say that we have a classification problem:
03:43
we want to try out and evaluate different classifiers: a dummy classifier, logistic regression, decision trees, random forests.
04:01
Now we might also want to experiment with different datasets, and we probably also want to see how robust these classifiers are depending on the seed, so that we know the performance is not just luck. Even with this very simple
04:21
setup we get to more than 100 configurations, which would be modeled as 120 runs. This is of course a very important problem, and there are well-established, robust solutions to it.
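The combinatorics described above can be sketched with `itertools.product`. The exact grid is an assumption for illustration: the four classifiers come from the talk, while 3 datasets and 10 seeds are one combination that yields the 120 runs mentioned.

```python
from itertools import product

# Hypothetical grid: the four classifiers are named in the talk; the number
# of datasets (3) and seeds (10) are assumed so the total matches 120 runs.
classifiers = ["dummy", "logistic_regression", "decision_tree", "random_forest"]
datasets = ["dataset_a", "dataset_b", "dataset_c"]
seeds = range(10)

# Each combination of classifier, dataset, and seed is one run.
runs = [
    {"classifier": c, "dataset": d, "seed": s}
    for c, d, s in product(classifiers, datasets, seeds)
]

print(len(runs))  # 4 * 3 * 10 = 120
```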
04:44
In the last couple of days I have updated the numbers on the market share of the different solutions and frameworks for experiment tracking, and we see that MLflow and Weights & Biases capture most of the market; then we have
05:01
Comet, Neptune, Aim, and many others. The shares here don't add up to 100% because I rounded a bit, but overall these are the main players, and MLflow and Weights & Biases are basically the two most important ones;
05:21
the others have a rather marginal market share. A limitation shared not just by the top two but by all of these frameworks is their slowness, and the limits on what kind of information you can track.
05:40
By type limitations I mean: what if we don't want to track just floats or binary blobs? What if in our experiments we have dictionaries, lists, strings, timestamps, booleans, NumPy arrays, or data frames?
06:01
How can we track all the richness of the information we produce through our experiments? This is what inspired and triggered the creation of this new
06:20
framework for experiment tracking: MLtraq. I invite you to give it a try and see if it might be a good fit for your experiments as well. I will now show you a little example of how MLtraq works, and this
06:40
is basically also how you interact with all the other frameworks we saw before: the interface might be slightly different, but the steps are very similar. First of all, we create a session to wherever we want to store our tracked information.
07:01
In this case we track into a local database, but it can be anywhere. Once we have created a session, that is, once we have decided where to store the tracked information, we create an experiment, which we call "test". Once we have our experiment, we can add runs,
07:22
and in this case we just add a dummy run where we set an accuracy value to a float. Once we are done running our experiment, we might want to save the results of what we tracked,
07:41
and once you have persisted the tracked information, you might of course be interested in querying it, retrieving it for later analysis, comparing experimental results across different experiments, and so on.
08:01
If we store the tracked data in a relational database like SQLite, then we can also easily query it with SQL, for example; and if it is Postgres, it can also be used by someone else in the team to populate a dashboard, and so on.
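The steps just described (open a session pointing at a store, create an experiment, add a run, persist it, then query it back with SQL) can be sketched with the standard-library `sqlite3` module. This is not MLtraq's actual API, only a minimal illustration of the workflow:

```python
import sqlite3

# "Session": a connection to wherever we store tracked information
# (here an in-memory SQLite database; a file path or a server would also work).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE runs (experiment TEXT, metric TEXT, value REAL)")

# "Experiment" named 'test', with one dummy run tracking an accuracy float.
con.execute("INSERT INTO runs VALUES (?, ?, ?)", ("test", "accuracy", 0.9))
con.commit()  # persist the tracked results

# Later analysis: query the tracked information back with plain SQL.
rows = con.execute(
    "SELECT metric, value FROM runs WHERE experiment = ?", ("test",)
).fetchall()
print(rows)  # [('accuracy', 0.9)]
```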
08:23
And now that we have seen how we can design experiments, which frameworks exist, and, as we will see, some of their concerns and limitations, we are ready to experiment a bit and see
08:42
how these different frameworks perform and what makes them fast or slow, which I think is very interesting. We will look in particular at tracking floats and arrays of floats, and at instrumenting the frameworks for profiling.
09:03
Let's get started. We will see results for three categories of what could be slow. Say we want to start a new experiment and do our tracking: how much time does it take to start tracking, for the tracking process to
09:23
just kick off? Then, once we start tracking, how frequently can we track our metadata, our performance evaluations and metrics? Of course, we want to be able to track as frequently as we want.
09:43
And once we can track as often as we want, we might also want to track very large objects, so that we are not restricted in what kind of information we track. So ideally we start tracking really fast, we track as frequently as we want,
10:04
and the objects being tracked are as large as we want: that would be the ideal. Now let's have a look at how these different frameworks compare and perform.
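The three cost categories just listed (startup time, per-track latency, handling of large objects) can be measured with a simple `time.perf_counter` pattern. The tracker below is a stand-in no-op class invented for illustration, not any of the frameworks being compared:

```python
import time

class NoOpTracker:
    """Stand-in tracker used only to show the measurement pattern."""
    def __init__(self):
        self.values = []
    def track(self, value):
        self.values.append(value)

# 1) Startup cost: how long until tracking is ready.
t0 = time.perf_counter()
tracker = NoOpTracker()
startup = time.perf_counter() - t0

# 2) Per-track cost: average latency of each individual tracking call.
t0 = time.perf_counter()
for i in range(10_000):
    tracker.track(float(i))
per_track = (time.perf_counter() - t0) / 10_000

# 3) Large-object cost: tracking one big array-like object.
big = list(range(1_000_000))
t0 = time.perf_counter()
tracker.track(big)
large_object = time.perf_counter() - t0

print(startup, per_track, large_object)  # timings vary by machine
```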
10:22
In the first experiment we try to track just one to ten floats. Here we see the averaged results: we simulate an experiment where we have just one
10:41
value, or ten values, of type float being tracked.
11:02
On the left side we see a bar plot where the best-performing framework is Neptune and the slowest is Weights & Biases, taking 1.6 seconds just to start; the annotation shows it is roughly 390 times slower than the fastest one, and this is how you can read this plot. So between the slowest and fastest there is a difference of about 400 times, which is pretty big. What makes it so
11:21
slow? Well, in Weights & Biases, threading and IPC are handled in a way that is quite slow, as we will see with how threads and locks are managed. The fastest is Neptune, which basically writes directly to the file system;
11:44
Neptune does very little, and that is also why it is very fast: the less you do, the faster you are. In the case of the others, the cost is either writing to the database or dealing with the database workflow. Interestingly, even if the database doesn't exist yet, they insist on doing a complete schema migration, which takes some time
12:07
with all the schema changes. And here we see an example of what happens if we inspect and profile Weights & Biases: most of the time is actually spent waiting on a lock, and then we have
12:25
some tracking work on the Weights & Biases side, which takes 80% of the time; so it is rather slow if you keep repeating this process. Let's say now that we track just one float, and we track it
12:43
across 100 runs: a very simple experiment that we repeat 100 times, and for each run we just track one float. Here things are a bit different: MLtraq becomes faster, and I will explain why in a second, while the worst-performing ones, MLflow and Weights & Biases,
13:04
get very slow; the scaling is worse than linear, seemingly exponentially slower as we add more runs. So if your experiments have a large number of runs, a large number of configurations you want to test, MLflow and Weights & Biases get extremely slow.
13:24
Why is MLflow so slow? Well, it relies on what is called the entity-attribute-value model, where basically every time you track a value you are appending rows to different tables,
13:43
which is quite expensive. Sorry, I think I mixed up the slides; sorry for the interruption. So in this experiment, what we actually see is that we increase the number of floats
14:01
from 100 to 100,000, so we have way more floats being tracked now: one run, a high number of floats. And we see that MLflow and Weights & Biases are still slow, but MLflow is actually much slower now than all the others. Why is that?
14:23
Well, as I mentioned, with the entity-attribute-value model, every time you track something it keeps adding more records to different tables, which is very expensive. What is also interesting here is that you either batch inserts or you keep inserting
14:41
and tracking at the level of individual data points, which means streaming versus batched tracking; depending on your use case you might prefer one way or the other.
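A minimal sketch of the entity-attribute-value layout described above, using the standard-library `sqlite3` (a deliberate simplification, not MLflow's actual schema): every tracked value becomes a new row in a generic table, so tracking N values costs N inserts unless they are batched.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# One generic table holds all metrics: entity (run), attribute (key), value.
con.execute("CREATE TABLE metrics (run_id TEXT, key TEXT, value REAL)")

# Streaming: one INSERT per tracked value.
for step in range(1000):
    con.execute(
        "INSERT INTO metrics VALUES (?, ?, ?)",
        ("run-1", "loss", 1.0 / (step + 1)),
    )

# Batching: one executemany call amortizes the per-insert overhead.
batch = [("run-2", "loss", 1.0 / (step + 1)) for step in range(1000)]
con.executemany("INSERT INTO metrics VALUES (?, ?, ?)", batch)
con.commit()

count = con.execute("SELECT COUNT(*) FROM metrics").fetchone()[0]
print(count)  # 2000 rows for 2000 tracked values
```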
15:01
Going back one slide: what makes MLtraq so fast in this case is the fact that it is writing to an SQLite database, which is much faster than writing to the file system directly. This is actually a screenshot from the SQLite website describing how this is possible: open and close system calls are very slow, and if you
15:26
let SQLite handle a single file on the file system holding all the data, it is going to be much faster than directly writing many potentially smaller files more frequently.
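The claim above (one SQLite file beats many small files, because each small file pays its own open/write/close system calls) can be tried out with the standard library alone. Timings vary by machine and file system, so this sketch only prints them rather than asserting a winner:

```python
import sqlite3
import tempfile
import time
from pathlib import Path

values = [float(i) for i in range(500)]

with tempfile.TemporaryDirectory() as tmp:
    # Option A: one small file per tracked value -> one open/write/close each.
    t0 = time.perf_counter()
    for i, v in enumerate(values):
        Path(tmp, f"value_{i}.txt").write_text(str(v))
    many_files = time.perf_counter() - t0

    # Option B: a single SQLite file holding all the values.
    t0 = time.perf_counter()
    con = sqlite3.connect(str(Path(tmp, "track.db")))
    con.execute("CREATE TABLE t (v REAL)")
    con.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    con.commit()
    con.close()
    one_db = time.perf_counter() - t0

print(many_files, one_db)  # compare the two timings on your machine
```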
15:42
Let's now try to track 1 million floats and see what happens. In this case Weights & Biases is still quite slow, at 2.4 seconds; MLflow follows, then Aim; the ordering of the different frameworks remains pretty much the same as before.
16:05
How can MLtraq be so much faster than the others, or at least than the worst-performing ones? Well, it relies on a safe subset of the pickle opcodes
16:20
plus NumPy's native serialization, which is very efficient, compared to the others, which basically write either JSON or binary blobs because they don't support much else; and that way you really lose semantics, so
16:42
you get all kinds of problems which don't really relate to time performance, but are interesting to highlight. A few words about safe pickling and why I think it's quite interesting. Usually you hear that pickling is not great because it's unsafe or not very portable,
17:04
but if you restrict the pickle opcodes to the ones that are safe, you can for example construct lists and scalars like floats and integers, and many other things, without incurring any dangerous opcode, which is quite good.
17:24
And if you fix the pickle protocol version you want to use, it's quite portable across different platforms, which is quite nice. And you don't really need to add more
17:41
performant custom packages or formats; you can just rely on what is already there, which is already super efficient. No need to add more.
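The safe-subset idea can be illustrated with the standard-library `pickletools`, which lets you scan a pickle stream and reject opcodes (such as GLOBAL, STACK_GLOBAL, REDUCE) that can import or call arbitrary objects on load. This is a sketch of the general idea, not MLtraq's actual implementation:

```python
import pickle
import pickletools

# Opcodes that can import objects or call them during unpickling.
DANGEROUS = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ",
    "NEWOBJ", "NEWOBJ_EX", "BUILD",
}

def is_safe(payload: bytes) -> bool:
    """Return True if the pickle stream uses no dangerous opcodes."""
    return all(op.name not in DANGEROUS
               for op, _, _ in pickletools.genops(payload))

# Plain containers and scalars pickle to harmless construction opcodes...
safe = pickle.dumps({"accuracy": 0.9, "steps": [1, 2, 3], "name": "run-1"},
                    protocol=5)
print(is_safe(safe))  # True

# ...while pickling a function reference requires STACK_GLOBAL.
unsafe = pickle.dumps(print, protocol=5)
print(is_safe(unsafe))  # False
```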
18:01
Let's now see what happens if we try to write 1 billion bytes of data. At this point the other frameworks basically cannot reach this scale anymore; they just get too slow. So here we see how things perform with MLtraq's different options for storing the
18:20
data being tracked. We see two versions here: either we write directly to the file system, which is of course the fastest, without the overhead of SQLite; we call this MLtraq FS. Then we have MLtraq DB-mem, where we basically
18:44
have an SQLite database, but in memory, so we do have all the capabilities of a database and we can query it very easily,
19:01
but we are not really writing to the file system. And lastly we have an SQLite file stored on the file system. As we increase the size of the array we are tracking, the SQLite database stored on the file system gets pretty slow.
19:25
If instead of writing to the file system you keep your database in memory, it's pretty fast: SQLite is then not that much slower than writing directly to the file system, though writing straight to the file system can be even faster.
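Switching between the in-memory and on-disk variants described above is, with the standard-library `sqlite3`, just a different connection string, as this sketch shows; the table layout is invented for illustration and is not MLtraq's schema:

```python
import sqlite3
import tempfile
from pathlib import Path

def store_array(db_path: str, values: list[float]) -> int:
    """Insert values into a fresh table and return the stored row count."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS arr (v REAL)")
    con.executemany("INSERT INTO arr VALUES (?)", [(v,) for v in values])
    con.commit()
    n = con.execute("SELECT COUNT(*) FROM arr").fetchone()[0]
    con.close()
    return n

values = [float(i) for i in range(10_000)]

# In-memory database: full SQL queryability, no file-system writes.
n_mem = store_array(":memory:", values)

# On-disk database: same code, but commits hit the file system.
with tempfile.TemporaryDirectory() as tmp:
    n_disk = store_array(str(Path(tmp, "track.db")), values)

print(n_mem, n_disk)  # 10000 10000
```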
19:45
Of course, it then depends on how easily you want to be able to process and access the data you just tracked. So, a few conclusions. Threading or IPC: which one is better? It depends on how you are handling your
20:08
communication, whether with threading or with distinct processes, and it impacts how you can then query or insert the tracked information: is it a web API,
20:27
like Weights & Biases, or a direct connection to a database? Are you relying on custom integrations, or staying away from
20:42
integrations? Then you can also decide between batching and streaming, whichever kind of tracking might be a good fit for you. And as we saw, native SQL types, Python types, and open formats are all we need for very performant tracking;
21:01
we don't need more JSON encoding or similar formats, which are kind of slow and don't really offer rich semantics on what you just tracked. And what is the impact of this presentation, of this study, of this project MLtraq?
21:23
Besides providing value on its own in a rather unique way, it has already brought to the surface, at least for Weights & Biases, some performance issues that have been solved in the latest version of their SDK.
21:45
I'm happy to say that this project indirectly contributed a bit to significantly increasing the performance of Weights & Biases tracking. With this I'm very happy to conclude and answer any questions. Thank you.