CK: an open-source framework to automate, reproduce, crowdsource and reuse experiments at HPC conferences
Formal Metadata

Title: CK: an open-source framework to automate, reproduce, crowdsource and reuse experiments at HPC conferences
Number of Parts: 561
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/44185 (DOI)
Transcript: English (auto-generated)
00:05
So hi, everyone. My name is Grigori Fursin. I'll try to speak up; if you can't hear me, just raise your hand. So I'm very excited to be here, first of all, because it's my first FOSDEM. And it was really very cool to see many interesting tools
00:21
out there, many interesting talks. And I'm also excited to present to you this new community project, which is called Collective Knowledge. And I think it's a bit of an alternative view from a researcher's perspective. So we have all those cool tools out there, but as a researcher, I want to actually use them. And I'll tell you about the possible problems
00:40
and solutions. And in the past year, I was very glad that we worked with some great companies, universities and nonprofits. They provided lots of feedback while we helped them use CK, and they were doing some cool things, like performance regression testing, crowdsourcing experiments, and even automatically generating reproducible articles.
01:03
So I'll mention this. And again, last year, while working with those organizations, we actually validated that our approach works with most of the tools out there, which makes me quite happy. But what is it all about? So in my past life, I was a researcher.
01:21
And I was working on machine learning systems and compilers. And basically, when you have some interesting idea and you want to implement it, you try to look at the different papers being published. And in the last few years, there were thousands of papers published on machine learning. And actually, someone told me when I made those slides
01:42
that, in fact, last year there were 10,000 papers published just on machine learning, if you count reports, blogs, and so on. 10,000! So as a researcher, I want to validate them. And most of them actually didn't share code and data. So what do I do? So OK, when I was young, I thought, OK,
02:00
I'll start looking at some of the papers. I will learn the great tools which you are working on. And I learn. I look at all the different tools out there which really help you to simplify your life. I keep learning them. I become old; I have white hair now. And at the end, I manage to implement my idea.
02:21
And when I implement my idea, I suddenly realize: oh, the TensorFlow API changed; my new CUDA doesn't support my new GCC. So what do I do? I start learning again. Or I just quit. I get depressed. Or some people, like Kenneth, try to create some cool tools like EasyBuild, which is great. But that's, I think, a minority.
02:41
So we know all those problems; I faced them 10 or 15 years ago, but in the last five years everyone has been speaking about them. So one of the buzzwords everyone uses: let's do reproducible research. Fantastic. How do we do it? Five years ago, we started introducing what we call artifact evaluation at many different conferences.
03:00
And the idea was that when you publish your paper, you can voluntarily submit your code and data to some committee, and we'll try to validate your results. And it worked out very well. At the first conference where we did it, about five years ago at PPoPP, we got five artifacts out of 25 papers, which was still quite little.
03:21
But it worked. We tested the methodology and so on. Last year, about 70% of the papers at PPoPP submitted code and data. And that's a good thing. Now, what is bad? When we started looking, of course, there is still no methodology for how you share and reproduce all those complex experiments, performance evaluations, and so on.
03:41
It's really still in infancy. And what is really ugly is that over the past five years, I looked about 100 artifacts in all those 100 papers and top conferences. And they all have 100 DevOps scripts to do some experiments, to download some models, to hardwire your pass. If you want to change your pass, you need to find the script somewhere, fix it, and so on.
04:02
And so all of us evaluators spend more time trying to figure out how to deal with all those scripts than actually doing the fun stuff: evaluating your result and validating it. And at the same time, even though all this code and data is shared, when your lead researcher is leaving, or your student or developer is leaving,
04:21
no one is using all this code and data anymore. And for me it's really a shame that this happens, because there is so much interesting stuff out there. And of course, the latest thing is: oh, okay, let's use containers. And these are fantastic tools as a kind of end solution, as a snapshot. But again, they're hiding the mess underneath.
04:40
Someone still has to solve this mess, so it's not solved at all. However, with this experience from all those conferences, when I was looking at all those artifacts, I started realizing that those hundreds of artifacts all do the same things, all the time. And they're simple things. So what do you usually do? As a high-level algorithm,
05:01
I would say that I have my program, say image classification. I want to compile and run it with some different images out there. I want to adapt it to my software, so usually I would try to detect whether I have GCC, LLVM or Intel compilers available. I would try to find a data set somewhere, usually again with some hardwired paths and so on,
05:22
but it's still the same. And then I would run experiments, collect performance stats, produce some graphs, validate them, and create my paper. And thus, all of them are doing the same. So what is happening? When I discussed it with our colleagues, I was thinking that we're missing APIs, common APIs for all those tasks.
05:41
That's all. Can we come up with some simple APIs which will automate all those tedious tasks, so that everyone reuses those APIs? And they must be simple. And what is very important: you can also provide some meta information, meta descriptions, for all those components out there so that everyone can use them. And the big point is that now,
06:01
if we have APIs and those meta descriptions in some common format, we can enable DevOps. Why is DevOps not there in research? Because you don't have APIs. How would you connect your Jenkins, or whatever, with all these scripts out there? And that's how Collective Knowledge came into play. We said: let's create just a tiny Python library
06:20
which will provide you those human-readable modules, Python modules with human-readable functions. So, for example, there is a program module with functions like compile and run, and each call takes a dictionary or JSON input and always returns a JSON or dictionary output, because that makes it very easy to extend. At the same time, all the data which you share
06:42
would also have some kind of unified meta JSON. And what CK is, is just your command-line front end to call those modules. For now, that's all. And I know that when I was talking to researchers, they were like: oh, but it's too simple, we don't like it, we want complexity. But having been in industry for a long time,
07:00
I said no, the opposite: the simpler it is, the better. And that's why we had many fights, but okay. And now, when you have all those components, the more components you have, the more you can start assembling whole experimental workflows from them and build up complexity, do some fancy stuff. And I'll tell you later about some more fun stuff that we're doing with that.
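To make that JSON-in/JSON-out convention concrete, here is a minimal sketch of calling a CK module from Python. It assumes the ck package is installed (pip install ck); the benchmark name is illustrative, and the exact dictionary keys may differ between CK versions.

    import ck.kernel as ck

    # Every CK call takes a dictionary and returns a dictionary.
    # Here we ask the 'program' module to compile one benchmark entry.
    r = ck.access({'action': 'compile',
                   'module_uoa': 'program',
                   'data_uoa': 'cbench-automotive-susan'})

    # By convention, 'return' == 0 means success; anything else
    # comes with a human-readable 'error' string.
    if r['return'] > 0:
        print('CK error:', r['error'])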
07:21
And at the same time, when we were discussing with our colleagues, we said: okay, when you share your code and data through GitHub, let's just have some common format, usually with just very high-level information. So whenever you see this Collective Knowledge compatible badge on some GitHub repository (there are around 100 now), they share a common structure. The first-level directory tells you your module,
07:42
the Python module API, so that you know all the data underneath is abstracted by this API. And in the second-level directories you have what we call CK entries. This is your data, which can be anything: in your soft abstractions you describe how to detect some software, and your program entries describe
08:00
what their dependencies are, and so on. Again, very high-level. Later, if you're interested, you can look at all the information on the website; I'm just trying to give you a very high-level idea of what we did. However, of course, it's not magic: someone still needs to implement all those APIs, and this was a rather tedious task.
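To give a rough picture of that layout, a CK-compatible repository might look like this (the entry names here are illustrative, not taken from a specific repository):

    my-ck-repo/
      program/                     <- first level: the CK module (the API)
        image-classification/      <- second level: a CK entry (the data)
          .cm/meta.json            <- meta description: dependencies, tags, ...
          classify.cpp
      soft/
        compiler.llvm/
          .cm/meta.json            <- how to detect this software on a system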
08:20
So in 2017, we got our first adopters: ARM and General Motors. ARM, of course, develops lots of hardware and different algorithms, and they need to do big performance regression testing. They have multiple work groups working together, so I was thinking that maybe they could connect all their work groups with the same framework, and the same for General Motors. So we started gradually adding those APIs ourselves,
08:43
or with our colleagues. And what were the first four APIs that were provided? Very simple ones, again. First of all, what do you want to have? You want to describe the operating system: how you compile some program, how you can find some data there, and so on.
09:02
And this works across different operating systems: it can be Linux, macOS, Windows; ARM contributed lots of stuff for Android. Then the same thing for what we always do: detecting platform information. Again, notice that our APIs underneath call the different tools out there which you develop; we're just providing a common API so that everyone else is protected
09:21
from changes in the system. The same for installing: we want to adapt to your native environment. So we provided many APIs and what we call plugins for detecting software; there are around 500 software detection plugins. So basically, you can just say ck search soft with, say, LLVM, a data set, and so on.
09:41
And it will find you all the installations out there. And the same for packages. There are great tools out there (EasyBuild, Spack, and some others), so we provided an API for those kinds of tools, so that anyone can share even higher-level recipes describing which tool to use and what the dependencies are.
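For example, software detection and package installation can be driven through the same dictionary API. A minimal sketch, where the tag values are made up for illustration (on the command line this would roughly correspond to ck search soft --tags=... and ck install package --tags=...):

    import ck.kernel as ck

    # Find registered software-detection plugins matching some tags.
    r = ck.access({'action': 'search',
                   'module_uoa': 'soft',
                   'tags': 'compiler,llvm'})
    if r['return'] > 0:
        raise RuntimeError(r['error'])
    for entry in r['lst']:
        print('detection plugin:', entry['data_uoa'])

    # If the software is missing, ask the 'package' module to install
    # it via whichever backend (EasyBuild, Spack, ...) the recipe uses.
    r = ck.access({'action': 'install',
                   'module_uoa': 'package',
                   'tags': 'lib,opencl'})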
10:01
So if your software is missing, you can automatically install the missing package using the tools available out there. And just one last high-level thing about CK, because it probably still looks like magic. Some 20 years ago, I was working on performance autotuning. And basically, last year, I was talking
10:22
with my colleagues from the Raspberry Pi Foundation, with Eben Upton, and we said: why not create a program workflow which would do performance regression testing of different GCC versions on Raspberry Pi devices? And we created this workflow, which was basically the work
10:41
I did 15 years ago in my PhD. We did it in about a month. And then we ran experiments. We crowdsourced experiments: with the same workflow, different Raspberry Pi users were running the experiments on their machines and collecting data on our cKnowledge.org server. And suddenly, in I think one week, I collected so much data,
11:00
more than I think I collected in five years of my PhD, which on one hand was a great experience. On the other hand, I was crying that I had spent five years of my PhD just collecting this data when I could have done it in five months. And now I can do it in 10 minutes, which is good, because I was told I have 10 minutes. That's fine. Okay.
11:21
And the cool thing, if you look on the website: with the Raspberry Pi Foundation we created an interactive report which is also automatically generated through CK. And if you look here, you will see everything we did. It generates PDFs with all the papers,
11:40
with all the graphs. And I just want to scroll down somewhere. So here, in fact, you will see the whole way we reproduce experiments; you can do it yourself if you look through this paper. But what is fun now is that, because we have APIs, we can embed interactive graphs,
12:03
and if you click on one in your paper, you go... oops, sorry, no clicking. I don't know, some cool point. This is like performance tuning: you get to the repository and you see different information about where this experiment ran and what the performance was. You can view all the compiler flags which were used.
12:22
You can replay it on your machine, because again, it's all made for replay. And different work groups can do it. And now, I still have eight or ten minutes, and I wanted to show you a small demo, just to show that you can do it yourself now with a few clicks. You can actually start participating in this crowd-tuning from your machine, from any machine.
12:42
Maybe there will be some bug or something, but let me check if it works. I usually don't like to do live demos because they often don't work. Can you see the font? Everyone okay? No? Too small? It's too small. So now I'm losing time
13:00
trying to fix this, but okay. Ah, yeah, it's better.
13:28
So now I'm on one of our remote servers where I should perform some performance regression testing. And we share those CK repositories. And again, what happens is that you will see
13:40
many CK repositories out there, usually GitHub repositories, which we pull; different work groups or communities work on those in parallel. It's a distributed system. And what I will do now... so I have many programs here, many data sets here: ck list.
14:01
What if I do the following? All those repositories live in one big CK directory. What if you did this on your website, or to your users in an HPC center: what would they tell you if I removed all your repositories?
14:20
And I'm doing it now. And I hope people will not fire me if they're using this website or this server, because usually they're reusing my server. So I have, I think, one minute to restore it. So now, when I do ck ls program, I don't have anything anymore. I don't even have this module.
14:40
So what do I do? I start by saying: okay, let's pull the repository with crowd-tuning. It should normally go to GitHub, where it was shared. And it's now also pulling all the sub-repositories, all those different APIs out there. I need to rebuild the whole workflow to do this interactive paper, this experiment on performance tuning.
15:01
So I still have like 30 seconds, I think, to restore it. And I'm checking my phone; no one is ringing me yet. So now I have the CK repo back. So now, again, let's check: I have the CK programs. And what is interesting now is that I can just compile a program out there:
15:23
the Susan corners automotive benchmark. And what happens is that it starts detecting all the dependencies. I don't have time to show you how it looks, but you can go on GitHub and you will see all the meta information about those programs. So now it's going through plugins to detect all the dependencies. It needs a compiler; so it detected some Intel compiler,
15:41
and it goes through GCC. Still 10 seconds left. So let's install, I don't know, some GCC. And each time it's interactive: it tells you what is available on your system, so we're adapting to your system. Clang: okay, it found it. So now it has detected all the dependencies. It asks you how you want to compile the program. Let's compile with the GCC compiler.
16:02
We compile it. We run the program, the same program. The first time, because I deleted everything, I need to say that my platform is Linux; it can be done automatically. And now I'm running corner detection on some images. Notice that all those images, again,
16:20
come from a repository someone else shared; I can pull a new repository and have hundreds of those images from someone else. I don't need to substitute all the paths: CK automatically finds all of this for you. I run the program 10 times, perform some statistical analysis, and this normally takes about 10 seconds. And it gives me some performance information.
16:42
Notice that when you use Python to call CK, you actually get a huge JSON file internally with the provenance. And now, just to start participating in this performance regression, I just do ck crowdtune program. And it asks you about all the different plugins:
17:01
what you can tune; you can tune models, you can tune compilers, you can tune whatever you want. I just chose a few things. And now it starts tuning your program and sending the results back to our server. So in fact, for this reproducible article, in a few days I can get even more results. And it will be a live article, because I will be getting your results and I can apply machine learning: something I was craving 15 years ago.
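The whole demo above boils down to a handful of CK calls. A minimal sketch in Python (the repository and benchmark names match the ones used in the talk; the exact keys, such as 'repetitions', are illustrative and may differ between versions):

    import ck.kernel as ck

    def step(request):
        r = ck.access(request)
        if r['return'] > 0:
            raise RuntimeError(r['error'])
        return r

    # 1. Restore the shared repository (pulls sub-repositories too).
    step({'action': 'pull', 'module_uoa': 'repo',
          'data_uoa': 'ck-crowdtuning'})

    # 2. Compile the benchmark; dependencies are resolved via plugins.
    step({'action': 'compile', 'module_uoa': 'program',
          'data_uoa': 'cbench-automotive-susan'})

    # 3. Run it several times and collect statistics.
    # (Joining crowd-tuning is 'ck crowdtune program' on the CLI.)
    step({'action': 'run', 'module_uoa': 'program',
          'data_uoa': 'cbench-automotive-susan',
          'repetitions': 10})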
17:24
Okay, I'll stop this. So the idea is to tell you, first of all, that I rebuilt this whole workflow, everyone still has access to all those repositories, and it works. So it's very quick. Imagine now if I deleted all my stuff and it took everyone a huge amount of time to restore it. I saw some PhD students who actually quit their PhD
17:42
because their server died and they couldn't restore their stuff; their backup wasn't there, and so on. So now it's very easy to reproduce things. And going back, I have five minutes: I just wanted to tell you a few things about how our companies and universities now use CK. Again, to give you an idea
18:01
of what may be useful for you. I think it switched, sorry. Yeah. So last year I went to Seattle and I was talking to my colleagues at Microsoft and the University of Washington, and we said: oh wow, with this cool stuff, now that
18:20
you have all those shared components, we can actually enable open science. It's again a buzzword, but what do we mean? Instead of publishing your paper, validating it and then accepting it, why not tell people to share their CK workflow for a given task? We'll validate it, we'll have a real live scoreboard; say you claim you have some performance analysis,
18:41
we'll check it and then we'll accept the paper. And with those guys, with Washington, Cornell, Cambridge and a few other universities, we created this tournament last year at ASPLOS. We took a very simple task, image classification, everyone knows it, and we said anyone can submit any solution for image classification; it can be software,
19:01
hardware, a model, and just show us what your performance, throughput and accuracy will be; we'll put it on a scoreboard and accept it depending on the results. And this was quite a success, so we got very diverse artifacts. We had some cool stuff: we had a submission from someone providing a cluster
19:22
of 10 Raspberry Pi devices and showing that they can do image classification at the same speed as Tegra TX machines, which was really cool. On the other side, we had a submission from Intel using a powerful server
19:41
in the AWS Amazon cloud. And just one month before, they had published a paper claiming that they got a record throughput, like 450 images per second on this server, which was a 50x speedup over the traditional Caffe framework they compared against.
20:01
And we validated it. We plotted all these graphs, and again you can go to this link and see all the live data from this tournament. And what is interesting: we validated this result from Intel, but it was not easy. When we started looking at their code and data, it wasn't shared with CK, so we started validating it by hand.
20:20
Ah, we are not getting 450 FPS, we are getting like 50. And this is what happens when you read a paper and you don't know all the details. And then you say: ah no, I don't trust those guys, they're cheating, it's really not nice. But because it was a validation, we kept testing it. Ah, a library was missing: they used very specific Intel libraries, MKL-DNN and MKL, which you had to use.
20:42
We got it, and we got like 200 FPS. Already better, but still not 450. We kept digging in, and by the way, those guys were extremely supportive and we worked together to fix the problem. And then we found the bug: they had this model which was supposed to be INT8, but it wasn't INT8, it was still FP32. So we fixed it,
21:03
and now we got about 450 FPS. We fixed the workflow, we fixed all the dependencies, and we shared all this, so in the ACM Digital Library you have the paper with those artifacts. And what is even cooler is that one month later, Amazon colleagues called us and said: wow, we saw this workflow,
21:21
and in two minutes we ran it in our Amazon cloud and it worked; it actually gave us the results. Furthermore, we ran another workflow, an FPGA submission, and it also worked. And we had a joint presentation a few months ago, you can see the details. So now, when you publish a paper, it's very quick to reproduce the results in a native environment.
21:42
You can use Docker, that's fine, but you can now actually use it on different systems and it can adapt to your environment. Just two more minutes, I'm quickly finishing. General Motors: they work on self-driving cars. They usually have to find a good solution; this is public information, you can see it on YouTube,
22:02
so I'm not revealing any secrets. Like for any other self-driving car, they need to find the best combination of model, software and hardware which would cost, let's say, less than $100. You can't use a $1,000 chip in your car because it's a bit expensive; it has a budget of, I don't know, 10 watts of energy,
22:20
and it has to be accurate so that you don't kill a pedestrian, because there were such cases, as you know. So this is a very complex task. And they said: oh, but now we can actually use those workflows, run them on different software and hardware, and find the best solutions on the Pareto frontier. So again, this helps them.
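Finding "the best solutions on the Pareto frontier" here just means discarding every configuration that is beaten on all objectives at once. A minimal, self-contained sketch of that filtering step (the board names and numbers are invented for illustration):

    # Each candidate: (cost in $, power in W, top-1 error); lower is better.
    candidates = [
        ('board-a', (95.0,  8.0, 0.12)),
        ('board-b', (60.0, 12.0, 0.15)),
        ('board-c', (99.0,  9.0, 0.20)),   # beaten by board-a everywhere
    ]

    def dominates(x, y):
        """True if x is at least as good on every objective and better on one."""
        return (all(a <= b for a, b in zip(x, y)) and
                any(a < b for a, b in zip(x, y)))

    pareto = [(name, obj) for name, obj in candidates
              if not any(dominates(other, obj)
                         for _, other in candidates if other is not obj)]
    print(pareto)   # keeps board-a and board-b, drops board-c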
22:40
I'm also reproducibility vice-chair for Supercomputing, trying to see if we can automate the different submissions, and we did this proof of concept with, say, a single application. You can look at how we automated it and how we can run it on different supercomputers. And okay, I'm finishing with the last thing,
23:01
because time is up and I have one minute. We did the quantum tournament just one week ago in Paris, and it had a good turnout; you can also see the quantum computing devroom here. We're actually reproducing work with IBM and Rigetti, we're reproducing some of the results, and we had some interesting new submissions about improving machine learning techniques,
23:22
so you can see it online. Time is up, I have one last slide. So now you could say: okay, it's all magic, it's all solved. Not at all. We are only at the very, very beginning of this journey of telling people that we can do reproducible research. And I think that CK is just probably the first prototype,
23:43
and I consider it still a prototype, which allowed us to connect all those different tools together, everything we are building, and to let researchers actually take advantage of them without spending their time understanding what you are doing: if you provide this kind of simple API with your tool, hopefully you can plug it in. But there is a lot to be done, and we are open to collaboration,
24:01
because we are looking at standardization of the APIs, we are improving installation, GUIs and so on, and providing more APIs. With that, again, I'm open to collaboration, contact me, and thank you for your attention. I hope it will be useful.
24:21
I'm sorry I was speaking fast, but that way we even have time for questions. We actually go until 10:40, which is exactly... oh yeah, sure.
24:59
Okay, so the framework you call Luigi,
25:02
I don't know this framework, unfortunately. So the question was whether we considered using the tool Luigi; I will look at it. Apparently it can help when you change small parts: it can help you trace them and so on. So no, we didn't.
25:20
So far, what we do is: when someone changes something, since it's a common workflow, someone else complains. We had a case when we broke something, and ARM told us within a day: oh my god, something is broken. So it's an open project, just like every project. And there can be tools like this which we think can help make it more stable now.
25:41
So I will look at it, thank you, but I don't know this tool yet. One more question? Probably there is one more question... yeah, please. Oh, are they using it in production now? So yeah... oops, sorry, I'm falling down.
26:01
Do we have any experience using those workflows in production? Yes, they are used in production. As I said, I don't know if I can officially say which companies are using it; I'd prefer not to for now, even though you saw some of them during the talk. Yeah, it's used all the time in production. Actually, the reason you didn't hear about this framework
26:22
over the last few years is that I was actually working a lot with companies, trying to understand what they need, and they're using it. And now I'm kind of making it public. But yeah, absolutely, my main focus is working with companies. Okay? So thank you very much.