12 Factor Apps for Data-Science with Python
Formal Metadata
Title | 12 Factor Apps for Data-Science with Python
Number of Parts | 9
License | CC Attribution 4.0 International: You may use, adapt, copy, distribute, and transmit the work or content in adapted or unchanged form for any legal purpose, as long as the author/rights holder is credited in the manner specified by them.
Identifiers | 10.5446/62307 (DOI)
Transcript: English (automatically generated)
00:00
Hello everybody. Thanks for the nice introduction. Yes, my name is Peter. I've been working with Python now for almost 20 years, and that's amazing. And still, when I look at Python code, it's like the first time.
00:22
I really remember it from an AI course at the Technical University of Berlin. I still love the cleanness of the language, and I still enjoy every day I can work with it. That's probably also why I'm not only using Python, but have been quite active in the Python community over the last years,
00:42
giving talks at EuroPython, at PyCon and PyData. And last year, with my partner in crime, Alex, and some others, I co-organized PyCon DE in Karlsruhe, and because it worked so well and was that much fun, we are going to do it again this year. So if you plan to come over, it's about three hours from here by train.
01:05
Just follow us on Twitter at PyCon DE or visit the website, where we will publish more information bit by bit. Last year we had about three tracks and one session with tutorials, and there were about 450 people. We were hosted in the ZKM,
01:23
the Center for Art and Media in Karlsruhe. It's a really beautiful venue, and I can only invite you to visit us in Karlsruhe again. So, what do I do for a living? I'm a senior software engineer at Blue Yonder. Blue Yonder is a startup with now about 120 to 150 people and about 70 data
01:47
scientists. We offer machine learning as software as a service, mostly in the retail sector. We calculate demand predictions and order proposals for our customers, big retail chains,
02:02
and we also do dynamic pricing, mostly in online retailing. Normally, my day-to-day work is on our internal platform, supporting our data scientists in getting data science into production. And in the last year I've been mostly busy
02:20
porting all of our stuff to Microsoft Azure, because management decided that we want to move from self-hosted servers into the cloud. What does a typical day in our stack look like? As an example, take our machine learning approach for replenishment:
02:44
early in the morning, we get new data from the customer. Our customers, big retail chains, every day send us the sales from the day before, the current stock levels, and the opportunities from their suppliers where they can get new stuff in.
03:01
All this comes in through an API into our systems, where we do data cleaning and checking, for example: if they deliver sales, do we also have locations and products for these values? Do these values make sense? After this, all the data is inserted into a central data warehouse. And from there,
03:22
we do feature generation for our machine learning models. That means, from a star- or snowflake-like schema, we build huge matrices with features for the machine learning. So, for example, for a product-location-day: how much was sold in the past? And we add extra features like: what was the
03:42
weather on that day, what was the price of this product in this location, was there a promotional effect for this product in this location? With these features we train our predictive models, and in the daily run, the machine learning model spills out a probability density function for each
04:05
product-location-day tuple. So we say: okay, with 50% probability, we expect that tomorrow you will sell two of these items in this location. We do this for multiple days in the future, and then we can take this data and
04:24
tell our customers: okay, if you want to satisfy your demand, you need to order, for example, 500 bananas in this location, because we think bananas are the new hot shit and you will sell a lot of them. Maybe you see the small gray box at the bottom.
04:43
That's basically me and my team, working on bringing the data science into production and automating all this stuff. That's mostly what this whole talk is about, but you should keep in mind: the business of our company is the machine learning, and that's the important
05:03
part. We are there to support our machine learning specialists. Maybe sometimes I talk a little differently, because I also think my stuff is important, but keep in mind: the machine learning is the important stuff, the rest is just a supporting role. So what does it look like when everything works fine?
05:24
Our customers are happy. The retailer is happy: he has fresh food, he has nothing to waste in the evening because he has good predictions, he has low stock value, so he doesn't have that much bound capital, but he also doesn't go out of stock in the evening.
05:42
Maybe you know it from these counters: if you go in late in the evening, everything is gone and you don't get the stuff you want. If you take our solution and everything works, this doesn't happen. So I could stop here now. If everything works, it's easy.
06:00
Sadly, it's not that easy. I'm going to tell you a little secret. Keep it to yourself, stop the recordings: data scientists are different. They are not like our ops guys, our engineering guys. We have lots of them in our company.
06:22
Most of them have a PhD in particle physics. They are very bright people with a strong academic background. They love IPython notebooks. They love it when stuff runs on their machine. They are really into getting the best algorithm for their problem.
06:41
But as soon as they have solved the problem on their machine, they move on. They don't have any interest in getting it into production and running stable every day. So that's where we somehow need to support them, to find a way to really get this stuff done every day. I'm going to tell you a small,
07:03
small story. Once, at a really big customer, we had problems with our demand predictions. We figured out: okay, we have a bias in our demand predictions, but we couldn't figure out what the problem with the model was. So what did the guy who discovered this bias do?
07:22
He was able to calculate a correction factor in his IPython notebook on his laptop. And for ten days in a row, every day at 12 o'clock, when the automated daily run was finished, he logged into his IPython notebook and changed the data in the production system
07:41
for millions of predictions with his correction factor. This is the worst thing that can happen: you have this amazing machine learning pipeline, you have all this automation, you have lots of people who put much thought into automation, and suddenly you depend on one guy
08:01
not coming back too late from lunch to fix the predictions. That's not what we want. It sounds funny, but this is mission critical. If he types in the wrong value, suddenly this specific retailer has double the amount of bananas
08:24
in his stores, and all the store managers say: well, what should I do with all these bananas? So in the platform team, we really try to enable our data scientists, but also to free them as much as possible from the stuff they don't like to care about: scaling, logging,
08:43
monitoring, getting stuff into production, while also giving them an environment that feels like using Python, that feels like doing machine learning, where it's easy to accomplish amazing stuff. But we always need to think: okay, how can we get this into production?
09:04
Our company was founded about five or six years ago, so we have built most of our stack on our own. We use lots of open source, but we still operate our own computing cluster. And
09:21
even after the switch into the cloud, where we now use virtual machines in Azure, we still have our own tooling on top of it. We have been influenced a lot by the 12 factor app manifesto. That's a write-up from the Heroku developers,
09:40
I think from 2014 or so, of their best practices on how to run and scale mostly websites and web services. So this does not fit data science applications 100%, but it really influenced lots of our decisions
10:02
when we built our own platform. The first factor: have everything in a codebase, tracked in revision control. For an engineer, that's a no-brainer. Still, not everybody works this way. We have heard lately that there is still CVS in production, and people mail
10:21
tarballs to each other. So that's the foundation: get it into revision control. One thing that's not always that easy in Python, or where there's still some discussion: explicitly declare and isolate dependencies. I'll come to how we do this later. I think it's most important that what you deploy to production is really pinned
10:43
down to the last dependency, so that you really know what is running in production. Otherwise you'll never be able to fix things.
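As a minimal sketch of what "pinned down" can mean in Python (my illustration, not Blue Yonder's actual tooling): a service can refuse to start if its environment doesn't exactly match a fully pinned requirements file.

```python
# Sketch: fail fast if the installed packages don't match the pins.
# Assumes requirements.txt contains only exact "name==version" lines.
import pkg_resources

def check_pinned(requirements_path="requirements.txt"):
    with open(requirements_path) as f:
        pins = [line.strip() for line in f
                if line.strip() and not line.startswith("#")]
    # Raises DistributionNotFound or VersionConflict on any mismatch.
    pkg_resources.require(pins)

check_pinned()
```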
11:02
Another thing: for people doing open source development, it's clear that you never check your credentials into your GitHub repository. With internal repositories, suddenly that's not so clear anymore. People hard-code database strings or logging servers into their application. That's also a no-go: split your code from your configuration. Think of them as chemicals: if you put them together, they will explode, so only bring them together in production.
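A minimal sketch of that split (the variable names are illustrative, not Blue Yonder's actual ones): the code only ever reads its configuration from the environment, so the same deployable runs against test, staging, or production backing services.

```python
# Sketch: configuration comes from the environment, never from the code.
import os

DATABASE_URL = os.environ["DATABASE_URL"]                 # fail loudly if unset
LOG_SERVER = os.environ.get("LOG_SERVER", "localhost:12201")
```

Swapping an attached backing service (the fourth factor) is then just a change to one environment variable, not a code change.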
11:24
Only at the last step, the code comes in from one side and the configuration from the other. The 12 factor apps are mostly about stateless apps. That's the easy part of scaling your application: you can shut them down and spin more instances up.
11:43
But especially in data science, you always need data to do the data science. So where do you store the data? Mostly in databases or in blob storages. So you need a way to have backing services attached to your application, and in the best case,
12:01
you are able to switch them on and off and to exchange them. If you separate the state from the stateless application and are able to change the resources attached to the application, then your operations, the daily business, will be much easier.
12:21
Another mostly no-brainer if you are a little into software engineering: strictly separate build and run stages. You have a Jenkins or something like that where you test your application, and at the end, if all tests are green, you have a deployable; that's mostly a tarball, a zip file, or a Python wheel.
12:44
Then you can run this deployable in different stages. What are stages? You can have a test environment where you put your deployable to test if everything works as you expect; you can have a staging environment that your customer can integrate against; and if all these gates are passed green, then you can put the
13:01
deployable into production. The sixth factor of the 12 factor apps is: execute the app as a stateless process. If you don't have state, that's fine; if you have state, you need to externalize it. We use lots of internal HTTP services,
13:22
but also Dask services. And you don't want to care about: oh, when I deployed, which host and which port does it run on? So you deploy your application, and the environment of the application takes care that you get a host and a port attached, and exports this information for other
13:42
services, so that they can consume your services.
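A minimal sketch of that port binding (the HOST and PORT variable names are an assumption, not the platform's actual contract): the service binds to whatever the environment hands it, and the platform exports that endpoint to consumers.

```python
# Sketch: bind to the host/port injected by the environment.
import os
from wsgiref.simple_server import make_server

def app(environ, start_response):
    # Trivial WSGI callable standing in for a real prediction service.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]

host = os.environ.get("HOST", "0.0.0.0")
port = int(os.environ.get("PORT", "8000"))
make_server(host, port, app).serve_forever()
```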
14:02
An easy way to scale out is via the process model. As you heard two talks before, Python is not the best at multi-threading, so it's always easier if you can have multiple processes. If you take data science tools like Dask, they will handle this for you: you just need a way for your environment to spin up lots of instances of your Dask workers, and then you can distribute your tasks to your computing engine.
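In Dask terms, that scale-out looks roughly like this sketch (the scheduler-address variable is an assumption; a toy function stands in for real training work):

```python
# Sketch: fan work out to Dask workers via the distributed scheduler.
import os
from dask.distributed import Client

client = Client(os.environ["DASK_SCHEDULER_ADDRESS"])

def train_partition(partition_id):
    # Stand-in for fitting one model per product/location partition.
    return partition_id * 2

futures = client.map(train_partition, range(100))  # one task per partition
results = client.gather(futures)                   # collect when done
```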
14:21
And the faster and more gracefully your applications start up and shut down, the easier it is to switch nodes in your data science cluster and to move certain workloads from one node to another. Best of all, don't try to use local resources, but always use remote resources; thus,
14:41
you can scale up and scale down much faster. About development in different environments, I've already said something. Another thing which is important, especially in distributed environments, is that you can't access your logs locally anymore. You have to treat your logs as event streams, and you have to push them
15:04
from the node where the computation is running to a central logging service, where you aggregate them and can then provide them to your data scientists.
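One way to do that in Python, as a sketch: write structured (JSON) records to stdout and let a log shipper on the node forward the stream to the central service (the talk later mentions Graylog).

```python
# Sketch: structured log records as an event stream on stdout.
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])

logging.getLogger("daily_run").info("predictions written")
```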
15:23
And another thing you should really take care of is that admin or management tasks, such as database migrations, run on your cluster, in an environment that is defined, and not from a local machine. So these are the original 12 factors. What did we make of the 12 factor manifesto? The first thing we decided:
15:42
everything that runs in production at Blue Yonder is deployed as a managed service, period. Before, we had snowflake servers: lots of people with root access to certain servers, deploying stuff; you don't know anymore what's running there. We decided: okay, that's not the way we want to go.
16:00
That's why we built a data science platform internally, where you can deploy your stuff as services in a managed environment, through APIs. So no more root access; we are done with this. This is roughly what our platform looks like. At the bottom, we have lots of physical or virtual machine nodes.
16:23
On top of this, we have Apache Mesos and Apache Aurora. You could probably exchange them with Kubernetes; at the time we decided on the software stack, it still looked like Apache Mesos might win the battle. You can think of Apache Aurora and Mesos as a cluster operating
16:44
system. It borrows many of the concepts from the Linux kernel, but it's not running on one virtual or one physical machine anymore; it spans multiple machines. You also have the possibility to give certain
17:03
resources like CPU, memory, or storage to processes, to isolate processes, and to start processes in containers where you can run stuff. But it's a framework that works over a whole distributed computing cluster. On top of this, we have identified some services that our data scientists need, and we provide
17:27
these as templates to our data scientists. The first is a WSGI service: you can basically deploy HTTP services with an API call. You say: okay, I want to install these requirements, and this is the callable that should be run when the service is brought up.
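As a purely hypothetical sketch of such a call (the endpoint and field names are invented for illustration; the real internal API surely differs):

```python
# Hypothetical declarative deployment call; all names are invented.
import requests

service_spec = {
    "name": "demand-prediction",
    "type": "wsgi",
    "requirements": ["demand-model==1.4.2"],  # pinned package from devpi
    "callable": "demand_model.api:app",       # the WSGI callable to serve
    "resource_class": "small",                # one of the t-shirt sizes
}
requests.post("https://platform.example/api/services", json=service_spec)
```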
17:44
This makes it very easy to develop and deploy new services in a defined environment. As I said earlier, you also need one-shot or cron services, same concept. The other three are more for the data science context.
18:00
Our data scientists can spin up their Jupyter notebooks, they can start Dask clusters for heavy machine learning, and we use Apache Airflow as a tool to orchestrate all this stuff. So what does a production system look like?
18:22
This is a screenshot from an internal management UI; don't look too much at the UI itself. This is a real customer, and you see we have lots of services. We have different service types, like the Airflow service, web services, and Jupyter notebooks. And
18:40
we can attach different resource classes to our services. If you just want to have a demo or test something, you can get a small instance with one CPU and eight gigabytes of RAM. If you need to do heavier stuff, you can move up to 64 cores and, I think, 256 gigabytes of memory. And of course, like the Dask clusters,
19:03
you can have multiple instances of your nodes, which then form a cluster. All this is manageable through the UI, but the preferred way is always to deploy via our API. We also provide internal Ansible playbooks where you can define
19:24
different services, and all this stuff is automatically deployed. As I said, we deploy lots of services for our customers, and we use Airflow
19:41
to really orchestrate all this stuff. In our daily business, there are lots of stages involved, and before we introduced Airflow, it was really hard for first level support, or even second level support, to see the current status of a daily run.
20:01
Since we introduced Airflow, it's much easier to see what's going on. You see, sometimes we fail; that happens. But you can see pretty fast: okay, where did the run fail? Can we trigger or restart certain services? And you just have a much better view of your overall system.
20:22
This was really a huge step forward for us: to require data science to provide APIs for their model runs, and then use these APIs via Airflow to trigger certain steps.
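A minimal Airflow DAG for such a daily run might look like this sketch (task names are illustrative; the trigger function stands in for calls to the model-run APIs):

```python
# Sketch of a daily-run DAG (Airflow 1.x style, as in the talk's era).
from datetime import datetime
from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def trigger(step):
    # Stand-in for calling the model-run API of one pipeline step.
    print("triggering", step)

dag = DAG("daily_run", start_date=datetime(2017, 1, 1),
          schedule_interval="@daily")

load = PythonOperator(task_id="load_data",
                      python_callable=lambda: trigger("load"), dag=dag)
predict = PythonOperator(task_id="predict",
                         python_callable=lambda: trigger("predict"), dag=dag)

load >> predict  # predict only runs after load_data succeeded
```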
20:41
So what does a service configuration look like? That's again a screenshot from internal tooling, but it's nice to demonstrate what you can have. We always support at least two versions of Debian containers that the data scientists can deploy stuff into. This helps with migrations:
21:02
you don't have one point in time where you are forced to go to a new version, but a period of two or three months. You can deploy to the new version, Debian 8 or Debian 9, check how the stuff is going, do the migration if it's working, and then wait for the next major operating system version. The same goes for Python. Yeah,
21:22
we still have some legacy Python 2.7 code, but mostly we are now running on Python 3.6. And you can have these predefined t-shirt sizes, as we call the resource classes, but you can also say: okay, I want something different, maybe some computations don't need that much RAM but more CPUs.
21:42
You can configure this too, and then the cluster operating system, Mesos, takes care of scheduling your service on a node that has enough resources to run it. What do you get for free with every service you run? You get an endpoint:
22:03
that's the actual location, which server this service is running on. And you can also configure static endpoints. If you shut down one node, Apache Mesos will take care of moving your service to another node, because it sees: okay, this node went down,
22:22
and it spins up a new instance on another node. So the endpoint would change, but the static endpoint would always stay the same; that's the pattern of having a co-located proxy beside your service. And then you have a log of the revisions that have been deployed of your service.
22:42
You can specify the requirements you want to deploy your service with, and you have different revisions. So when you deploy a new revision and see: okay, something breaks, it doesn't behave like it did before, you can switch back to a previous revision, and you always know what has been deployed into production.
23:03
One of the other very critical things, especially in the Python world, is packaging. If you want to know what's running in production, you need good packaging and dependency declaration. What we use internally is a software called devpi.
23:23
It's developed by Holger Krekel. It's basically an API-compatible clone of pypi.org, so it's a package index. We use it in three different ways. The first way: we really have a set of official,
23:41
whitelisted packages that can be used in our software. We cache and mirror these dependencies in-house, so that we don't depend on the external PyPI server. I don't know who of you is aware of the left-pad disaster in the JavaScript world, where one package was removed and everything went down because nobody could
24:03
deploy anymore. To prevent this, we keep all these whitelisted packages in-house. And all of our software that goes into production needs to be uploaded to our internal repository as well: you test it, you build it, then you upload it to the internal devpi,
24:23
and only from there can it be deployed into production. I'll just say a little about the next topic; it's pretty important for data science, not so much for this talk, and colleagues of mine have held talks about it.
24:41
We use three kinds of attached storage systems for our data science workflows. The first one is a highly performant columnar store: a sharded, multi-node, in-memory database. If you are interested in how we use it, just go to the PyCon talk from my colleague.
25:04
He's also the author of turbodbc, which provides very fast access to databases via ODBC.
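Usage follows the standard Python DB-API, sketched here with a made-up DSN and table; fetchallnumpy() is turbodbc's fast path into NumPy:

```python
# Sketch: query the columnar store via ODBC with turbodbc.
import turbodbc

connection = turbodbc.connect(dsn="warehouse")  # DSN name is made up
cursor = connection.cursor()
cursor.execute("SELECT location, sales FROM daily_sales")
columns = cursor.fetchallnumpy()  # dict: column name -> NumPy array
```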
25:22
The next attached service that our data scientists can use is a binary object storage. Since we moved to Microsoft Azure, we use the Microsoft Azure Blob Storage, especially together with the Apache Parquet file format. This is a very efficient way to store immutable datasets in an object storage. And immutable data is very important in data science,
25:44
because then you can test different algorithms and different configurations on the same data. That's also a very important thing.
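A small sketch of that pattern with pyarrow (the file name and data are illustrative): once written, the dataset is treated as immutable, and every experiment reads the same bytes.

```python
# Sketch: persist a feature table as an immutable Parquet dataset.
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

features = pd.DataFrame({"location": [1, 2], "sales": [5.0, 3.0]})
pq.write_table(pa.Table.from_pandas(features),
               "features_2018-01-01.parquet")  # then uploaded to blob storage
```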
26:05
Then we have a simple Postgres service as a transactional store, so our data scientists have the possibility to request Postgres as a service. If you want to see how this works, go to our GitHub repository. It's basically an API where you can say: okay, give me a small Postgres instance, and in the background it starts a new Postgres cluster, with suited credentials and some limits on size. It's always handy to have a transactional store at hand.
26:22
What we mostly take care of and provide to our users is observability. Observability is a measure of how well the internal state of a system can be inferred from knowledge that you can query from the outside.
26:40
If you look at this hierarchy of reliability from the Google SRE book, you see: if you want to have a good product, you depend on good development, you depend on capacity planning, testing, release procedures. You need root cause analysis, you need incident response. But all of this cannot happen if you don't have monitoring in place.
27:02
If you don't have monitoring in place, you don't know what's going on in your system; you can't do incident response, you can't do root cause analysis. So it's extremely important to have monitoring in place, and that's why we take care in our platform stack that monitoring is provided for each service, regardless of how you deploy it.
27:23
What's the topology that we have? We have metrics: we can query our services, and they give back metrics on their current status. We have tracing, where you can follow requests flowing through different services, see which
27:42
requests we got and which previous service triggered them. And of course you have logging events; we use Graylog for this. Having structured logging in place is always a big plus, because as soon as you start working on distributed systems, you don't know anymore where something runs, and structured logging really helps
28:03
you to aggregate information from your logs. What does a simple metrics query look like? This one is for an HTTP or WSGI service. If you query the metrics interface of one of our services, you always get this information back:
28:23
a key like the HTTP request duration in seconds, with some labels attached, like the endpoint and the method, and then a number. We use this information with Prometheus and Grafana. Prometheus is a time series database that queries all our services.
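With the prometheus_client library, exposing such a metric looks roughly like this sketch (the endpoint label and port are illustrative):

```python
# Sketch: expose a labeled request-duration histogram for scraping.
import time
from prometheus_client import Histogram, start_http_server

REQUEST_DURATION = Histogram(
    "http_request_duration_seconds",
    "Time spent handling a request",
    ["endpoint", "method"])

start_http_server(8000)  # serves the /metrics endpoint Prometheus scrapes

with REQUEST_DURATION.labels(endpoint="/predict", method="GET").time():
    time.sleep(0.05)  # stand-in for real request handling
```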
28:42
So if you spin up a new service, it will be queried automatically. Prometheus collects this data, and then you can put alerting in place on top of it, or build dashboards. This is a daily run, where we just see how much total memory a service has used. And if you're a little used to this dashboard, you see in the blink of an eye
29:01
whether the service is working correctly or whether things go wrong. We have also put crash reporting in place with Sentry, software developed by Armin Ronacher; he's also the main author of Jinja2 and Flask, so he's very well known in the Python context. But,
29:24
especially again with distributed systems, you want to get notified, but you don't want to get 500 mails from your 500 distributed clusters. You want to get notified once, and you want to get context. It's very easy to use in Flask; it's just wrapping your application, and you can also use the Raven client to capture messages on your own.
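With the Raven client of that era, the Flask integration is indeed just a wrapper, as in this sketch (the DSN comes from the environment; the route is a toy example):

```python
# Sketch: report unhandled Flask exceptions to Sentry via Raven.
import os
from flask import Flask
from raven.contrib.flask import Sentry

app = Flask(__name__)
sentry = Sentry(app, dsn=os.environ["SENTRY_DSN"])

@app.route("/boom")
def boom():
    # Any unhandled exception here is sent to Sentry with context attached.
    raise RuntimeError("demo crash")
```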
29:44
What do you get? Very importantly, you get context. If there is an exception somewhere, Sentry catches it, and you get the source code at the bottom, where you see: okay, where did the exception happen? You also get meta information; I think the most important is which release it was and which server it
30:03
happened on. And then you can act on this incident. You could just ignore it; that's pretty bad. But you could also flag this exception: okay, we have fixed it in a new version, and then just watch what happens. If you say it's fixed in a new version, you won't get notified again.
30:21
So this is also very important for your daily business. Time's up, I'm done. Just think about it the next time you are in a supermarket and see happy people shopping for groceries and everything is fine: it's hard to get data science into production every day. If you're interested in the slides, go to the Blue Yonder GitHub;
30:42
I'll upload the slides, and I hope to see you all at PyCon DE in October. Thank you very much, Peter. Who will think about Peter when shopping for bananas in the future?
31:02
Okay, questions? Questions, answers. I don't know if you already answered it, but I'd like to ask again: how do you deploy the data scientists' code, basically? It's like different notebooks and some crappy scripts.
31:23
Yeah, that's not really how we deploy in production. To bridge the gap, as you saw, the data scientist is forced to provide a package on the devpi server, and then he can say to a declarative API: okay, deploy me a service with this package and this version. And this is then used as the deployment on the server. Oh, I see. So
31:44
you deploy the package built by the data scientist. Yes. Okay, cool, thank you. We're running out of time, but I'm sure Peter will be around if you just come over and have a beer. So thanks again, Peter.
32:01
Okay.