Unlocking the Power of What-If Analysis for BI, Data, and AI with Taipy
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 141 | |
Author | ||
License | CC Attribution - NonCommercial - ShareAlike 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/68758 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
EuroPython 2023, 15 / 141
Transcript: English (auto-generated)
00:04
Hello everyone, my name is JB. I am a technical leader in a data team, and today I'm going to talk about unlocking the power of what-if analysis, which is a technique, with Taipy. Taipy is a new Python library which was released last year, and with Taipy
00:25
you can build interactive dashboards. It is one of the few libraries featuring both front-end and back-end features, and we will come back to that later. I propose we discover what-if analysis in four parts. In the first part
00:46
that I entitled hard-coded filters, we will see anti-patterns, so stuff you should avoid doing, and then from here we will iterate towards more and more advanced patterns until we use Taipy at its full potential. Hard-coded filters: I would like to start this talk
01:07
with a story. This story happened to me last year. I actually had to build a BI dashboard and this dashboard had to be built with Looker which is a BI dashboarding tool in
01:21
the Google Cloud ecosystem. The goal of this dashboard was to expose analytics KPIs for a SaaS, KPIs such as, for example, the number of sessions opened per user per day or the total duration of user sessions, stuff like this, so analytics.
01:44
And the data looked like this. The data was stored in BigQuery and there was a huge pile of data which was clustered by customers, so you would have a lot of data about customer one, a lot of data about customer two, and the company had
02:00
a lot of customers. Users play an important role in our story because we, as developers, we always need to remember that what we build is for them first and this is what should lead the features that we are releasing. To build this dashboard, we first went with
02:26
the first mockup. This mockup worked like this. So first you had a customer selector, so you would select customer one, customer two, customer whatever and then you would have KPIs about this customer like this. You would also have another column with
02:44
the same KPIs aligned but this time for a customer group related to the first customer. So the data was aggregated at a group level and then you would have also other columns with other groups of customers related to this one.
03:04
When you think about user experience, there are several things that can be improved on this version but the one that I would like to highlight the most is that we are dealing with hard-coded use cases and by that I mean that those customer groups are forced
03:22
which might lead to user frustration, because you are kind of forcing them to use the tool the way you wanted them to use it and not allowing them to use the tool the way they would prefer. And usually when programming
03:42
you want to avoid hard-coding stuff. So yeah, but it's a story so I hope you guys want a happy ending. So let's make this happen. Let's make this happen thanks to the scenario A versus scenario B approach. What the released dashboard looked like was something
04:05
like this. So first we have visual KPIs, cool, but most importantly what we have is two groups of customers that are dynamically set thanks to those filters. And this way
04:21
you can configure the orange group here, the blue group here and have a direct comparison between the two in your plots. So yeah, in contrast with hard-coded filters,
04:41
here you have an infinite number of use cases. You can compare one customer with a lot of customers, you can compare one group with another group and stuff like this. When we released this dashboard users were very happy because this pattern, this way of presenting stuff was actually very ergonomic because the users could dig into data in
05:06
a way they could not before and they could really compare customer performances. It's a cool story but the title of the talk was about what-if analysis. So what's
05:25
the connection? Well, I'm coming to it. But first I need to introduce the concept of scenarios. What is a scenario? A scenario is a set of input parameters. When you set those input parameters you get an output which is in our case a line. And when you
05:45
change input parameters you get another output, it's another scenario and so on and so on. It's simple. And what-if analysis is the process of iterating this way by changing input parameters and trying to figure out what your data can reveal about your customers.
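As a rough illustration of that definition applied to the dashboard story, here is a minimal Python sketch. It is not the Looker implementation from the talk: the `SESSIONS` records, the customer names, and the `total_duration_per_day` helper are all hypothetical. The input parameter of a scenario is the selected customer group, and the output is one KPI value per day.

```python
from collections import defaultdict

# Hypothetical session records: (customer, day, session_minutes).
SESSIONS = [
    ("customer_1", "2023-07-01", 30),
    ("customer_1", "2023-07-02", 45),
    ("customer_2", "2023-07-01", 10),
    ("customer_2", "2023-07-02", 20),
    ("customer_3", "2023-07-01", 5),
    ("customer_3", "2023-07-02", 15),
]


def total_duration_per_day(customers):
    """Output of one scenario: total session minutes per day for the selected customers."""
    totals = defaultdict(int)
    for customer, day, minutes in SESSIONS:
        if customer in customers:
            totals[day] += minutes
    return dict(sorted(totals.items()))


# Scenario A and scenario B differ only in their input parameter (the customer group).
scenario_a = total_duration_per_day({"customer_1"})
scenario_b = total_duration_per_day({"customer_2", "customer_3"})

print("A:", scenario_a)
print("B:", scenario_b)
```

Changing the set passed to the function and looking at the new output is one what-if iteration.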
06:10
This actually might look simple but actually it is what made users happy when I think about it one year after. When I released this dashboard I was not aware of the concept
06:25
of what-if analysis, of the concept of scenarios but now I think that this is pretty much related to it. And if we want to go deeper on the definition of what-if analysis we
06:41
can ask ChatGPT, for instance, which will give us an extensive definition. In this definition what interests me is that the conclusion is that what-if analysis is a structured approach to exploring your data or your datasets. Then ChatGPT adds some keywords
07:01
such as scenario which is important, input variables, assessing the impact, decision makers gaining insights which is what makes users happy because they gain a lot of insights about your data this way. What do we have? So specialised software such as Taipy,
07:22
you can also use spreadsheets even if it's less ergonomic to simulate your scenarios. Then ChatGPT also gives us examples of some industries that can benefit from it such as financial planning or supply chain management but actually as long as you have
07:42
data you are able to perform what-if analysis on it and this includes machine learning if you consider your features as input parameters. So now that we know what what-if analysis is, how do we unlock its power? So this is what I just
08:05
explained, this is performing what-if analysis. That on the other hand feels more like unlocking the power of it. Why? Because in this case we are stacking multiple scenarios on one plot so this allows us to compare things, to make more visual comparisons,
08:25
to get a more visual sense of our data. Just to make sure we understand the importance of stacking your scenarios on one plot, let's have a look at this case where we have isolated
08:42
scenarios. First, you can perform what-if analysis with isolated scenarios like this but what you will have to do if you want to compare your data is you have to open one tab and then another tab and set some parameters and switch between tabs so it's
09:01
not really user-centric. You will also have to work with different scales, which can prevent you from having a visual sense of your data, and this is especially important because usually BI tools set the scale automatically and sometimes you don't have control over it.
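To make the scale problem concrete, here is a small Python sketch under stated assumptions: the `series` values are made up and `shared_y_range` is a hypothetical helper, not part of any BI tool. It computes a single value range that covers every scenario, which is what drawing all scenarios on one plot gives you implicitly.

```python
def shared_y_range(series_by_scenario, padding=0.1):
    """Return one (low, high) range covering the values of every scenario."""
    values = [v for series in series_by_scenario.values() for v in series]
    low, high = min(values), max(values)
    margin = (high - low) * padding
    return low - margin, high + margin


# Hypothetical KPI values for two scenarios.
series = {"A": [30, 45], "B": [15, 35]}

# With isolated plots, each scenario would get its own automatic scale.
per_plot_ranges = {name: (min(vals), max(vals)) for name, vals in series.items()}

# With stacked scenarios, one range is shared, so heights are directly comparable.
common_range = shared_y_range(series, padding=0.0)

print(per_plot_ranges)
print(common_range)
```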
09:23
So it's not the best. What you will also have difficulties with is figuring out correlations between your scenarios. Again, yes, you can do it by switching between your tabs but it's
09:41
less optimal. This is not something that I am inventing. This is based on user feedback. I said they were not happy. They are less happy if you do this. They are still happy but yes, so they gave feedback about that and this is why we ended up with this version.
10:02
This is what was deployed with a scenario A versus scenario B approach. I actually had the opportunity to reach those users two weeks ago and they are still using this dashboard so they didn't change the structure so they are still happy about it.
10:27
That being said, there is a trade-off. If you want to stack multiple scenarios on one plot like this, you will have difficulties implementing it with mainstream BI tools.
10:42
I had to implement it with Looker. I managed to do it but it felt like hacking the tool to achieve this result. Actually, I had to write hundreds of lines of SQL in order to be able to have this result thanks to the help of derived tables. For those who don't know, in Looker you are not supposed to write SQL, you are more supposed to write
11:05
LookML. This was not really easy to implement with the mainstream tool and the thing about it is that it puts pressure on the development team because maintenance will be hard to
11:23
achieve especially because this is hundreds of lines of SQL per KPI and we had 30 KPIs that were still getting added so it becomes a huge patch of SQL that you have to maintain.
11:44
What about Power BI? Can we achieve this in Power BI? I don't think you can but even if you could, you would have to hack the tool as well to achieve this result and you will have to maintain it the hard way. We are in this situation where on one side
12:02
we have users that are happy about a feature, developers that are less happy about the situation and I am here in the middle trying to figure out if there is something we can do. It would be so cool if you had a tool to be able to perform this and guess what?
12:21
That tool actually exists. It is named Taipy and I have a small demo for you. Oh, here we go, Taipy. I have a small demo for you. In this demo I will showcase Taipy's
12:43
native scenarios because with Taipy you can have native scenarios. This is the only library with which you can do this as far as I know. In this demo I also have two input parameters like in my drawings. So what I will do is I will add a scenario
13:03
like this. Details don't really matter. I just name the scenario and there I have a sine function which is quite simple but what is interesting is that if I do this, okay,
13:21
oh, I am getting insights about my data. Okay, so what if I add another scenario to compare? So I am adding another scenario. Okay. What if I change the amplitude? Oh, I see. I understand better the sine function now. Let's add a third scenario just to,
13:45
just for fun. And that's it. We have three scenarios. What if I do that? I see. What if I do this? So yeah, that's me performing what-if analysis with a simple sine function.
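The demo can be approximated without any dashboarding library. The sketch below is a plain-Python stand-in, not the Taipy application shown in the talk: `SineScenario` and its field names are hypothetical, and I am assuming the two input parameters are an amplitude and a frequency. Each scenario yields one curve, and all curves share the same x values so they can be stacked on one plot.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SineScenario:
    """One scenario: a name plus two input parameters (assumed: amplitude, frequency)."""
    name: str
    amplitude: float
    frequency: float

    def evaluate(self, xs):
        """Output of the scenario: y = amplitude * sin(frequency * x) for each x."""
        return [self.amplitude * math.sin(self.frequency * x) for x in xs]


# Shared x axis from 0 to 2*pi in steps of pi/8.
xs = [i * math.pi / 8 for i in range(17)]

scenarios = [
    SineScenario("baseline", amplitude=1.0, frequency=1.0),
    SineScenario("double amplitude", amplitude=2.0, frequency=1.0),
    SineScenario("double frequency", amplitude=1.0, frequency=2.0),
]

# One curve per scenario, ready to be drawn on the same plot.
curves = {s.name: s.evaluate(xs) for s in scenarios}

for name, ys in curves.items():
    print(f"{name}: peak = {max(ys):.2f}")
```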
14:00
Of course this is a demo but I hope you can relate to your business cases. And this application, this Taipy application is about 80 lines of Python which is quite short for the feature that it highlights. We will have a look at the application in a moment but first let's come back to our presentation. So thank you Taipy. Introducing Taipy. What
14:29
is Taipy? So Taipy, just to recap, is a Python-powered library with which you can build interactive dashboards. The library is open source and if you are a company you
14:42
might be interested in enterprise paid features. The library features native scenarios with which you can build what-if analysis use cases on top of it. You can also build other use cases but here we are focusing mainly on what-if analysis. And again like this
15:03
is the only library that I know which allows you to do this. Another key aspect about Taipy is that you are able to write MVP code and put that code to production fast. You can turn that MVP to production-grade software thanks to the library which is actually
15:29
a cool thing because Gartner reports that 85% of Python pilots actually fail to go to production. Who knew about that? Raise your hand if you knew about that. Okay.
15:44
One person. This is a taboo in our community. Nobody talks about it. But it's real and Taipy is aware of that so this is why the library has been designed to offer
16:01
you the ability to write MVP applications and then it has also the features that allow you to turn them to production-grade codebase. How does the library achieve this? Well first what you build is web applications from the start featuring a low code syntax and
16:27
something I didn't mention yet is that Taipy features user management because once you want to go to production you usually have use cases such as I want my users to see only that, I want other users to see this and to be able to perform that action. This
16:43
kind of use cases and this is what you will be able to do with Taipy thanks to its user management features. Both native scenarios and user management are backend features which might sound odd at first look when you think about a dashboarding
17:05
library but I hope you understand better the interest of having backend features and this actually is a competitive advantage of Taipy compared to other data visualization libraries which you can find a lot these days. I would also like to show you the
17:29
code of the demo that I just showed you. So this demo, this is the code of that demo that you can see here. I hope it's big enough. What do we have? We have some backend code
17:45
which is not very interesting to describe here. This is mainly glue code which sets parameters to the appropriate scenarios. Okay cool. We have this state here which is specific to Taipy. I will not be able to cover everything due to time constraints
18:04
but what you also have is scenario related functions. So get scenarios, set this value for this scenario and this kind of stuff. And then what you also have is this and
18:21
this part is the frontend part. When you're writing a frontend in Taipy you basically have two options. You can write either HTML or Markdown. I used Markdown here because it was easier to get started. So this is native Markdown and that's, actually it's
18:42
extended Markdown because those tags are not standard Markdown tags. Those are Taipy-specific tags. What this line means, with this line, only one line, you get this whole component here which allows you to manipulate your scenarios like create, update, stuff
19:02
like this, read. On this line how does it work? Well you have a selected scenario variable which is updated each time you select a new scenario. This is the name of the component and that is a simple callback. So when things happen in the component the
19:24
callback is called so that you can handle the changes. All Taipy components work like this. So you have a variable with which you can get the value. You have the component itself such as the slider here. And then a callback so that you can react to everything
19:44
happening on this component. There is something else. There is this configuration part. In order for this application to work we need to give it a configuration. We need
20:03
to tell Taipy how our scenarios are wired under the hood. So this is what I do here by loading this TOML configuration file which is this file. I'm just showing it to
20:24
you for the record but Taipy's documentation recommends you not to update this file manually. Instead you have a nice VS Code plugin with which you can see the configuration in a visual
20:42
way. And what this basically means is that you have two input parameters that we map to our variables. We have a Python function which is mapped to this component here, this function, and then we have a data frame as an output. This here I will not cover but
21:02
it basically says that this is our entry point. Because this is a demo, this is a small use case. Of course when you write a production DAG you will have a lot more data nodes as well as tasks. And if you happen to
21:29
use Taipy in production, you might have variables such as the ones I am using here, as well as CSV files, SQL queries, data sets, and everything you can imagine as data.
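The wiring described above (two input data nodes, one task wrapping a Python function, one output) can be sketched in a library-agnostic way. This is deliberately not Taipy's TOML format or its configuration API: `CONFIG`, `run_scenario`, and `compute_sine` are hypothetical names, and a list of row dictionaries stands in for the data frame.

```python
import math


def compute_sine(amplitude, frequency):
    """Task function: turns the two inputs into rows (a stand-in for a data frame)."""
    return [{"x": x, "y": amplitude * math.sin(frequency * x)} for x in range(5)]


# Hypothetical configuration: data nodes with default values, and tasks
# that name the data nodes they read from and write to.
CONFIG = {
    "data_nodes": {"amplitude": 1.0, "frequency": 1.0, "curve": None},
    "tasks": {
        "compute": {
            "function": compute_sine,
            "inputs": ["amplitude", "frequency"],
            "output": "curve",
        },
    },
}


def run_scenario(config, overrides):
    """Run every task once, in declaration order, on a private copy of the data nodes.

    Assumes tasks are declared so that inputs are available before they are needed.
    """
    data = {**config["data_nodes"], **overrides}
    for task in config["tasks"].values():
        args = [data[name] for name in task["inputs"]]
        data[task["output"]] = task["function"](*args)
    return data


# One scenario = the shared wiring plus its own input values.
result = run_scenario(CONFIG, {"amplitude": 2.0})
print(result["curve"])
```

Keeping the wiring separate from the input values is what lets several scenarios reuse the same DAG with different parameters.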
21:46
So that's it for the code. Let's go back to the presentation. So this slide I copy
22:02
produced in other talks and the key idea of this graph is that Taipy is in a sweet spot because you get a lot of features while having a low learning curve. On the other hand you have other popular libraries such as Plotly with which you can have a lot of features but the learning curve is quite high. And if you happen to use Streamlit
22:27
in production you might be aware that yeah, you can get started easily but then you don't scale, you are not able to scale in production and you cannot benefit from backend features such as native scenarios or user management. So let's recap. We have seen that hardcoding
22:48
your use cases might not be the way to go. Then we have seen that what-if analysis is more user-centric and that you can implement it with tools like Looker, Power BI, Streamlit
23:01
which are popular. We have also seen the limits of having only one scenario per plot. This is where Taipy comes into play and thanks to native scenarios we are able to implement this. We can implement stacking multiple scenarios on one plot way more easily. And this
23:25
is actually like stacking multiple scenarios on one plot is my favourite Taipy feature that you can build upon native scenarios. Now that we know what Taipy is, that we
23:41
understand more what it does, we can have a look at advanced Taipy features and those features I would like to introduce with a concrete example. The example of McDonald's which uses Taipy for one of their use cases. This use case is that every week each single
24:03
McDonald's point of sale must publish a forecast of revenues. How do we achieve this? Forecasting revenues, well we can do it thanks to what-if analysis, thanks to native scenarios. Then you might want to publish your executions so that everyone is aware
24:23
in the company and it's cool because Taipy has a nice scenario registry feature that we have seen in the demo. And since we are dealing with a use case based on recurrence, so every week, every month, whatever you want, then Taipy has a cycles feature that
24:45
allows you to take advantage of this recurring need to compute your scenarios. Of course, this McDonald's application is a living application,
25:02
it evolves over time so you might get interested in versioning your scenarios, your executions, because everybody is talking about versioning code, versioning models, but nobody is talking about versioning executions. Of course, all those features are completely optional.
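The weekly recurrence mentioned above can be illustrated with a short sketch. This is a generic illustration of the idea of cycles, not Taipy's cycles feature or its API: the forecast names and the `weekly_cycle` helper are hypothetical. Scenarios created in the same ISO week are grouped together, so each week has its own set of candidate forecasts.

```python
from collections import defaultdict
from datetime import date


def weekly_cycle(day):
    """Identify the weekly cycle a date belongs to as (ISO year, ISO week)."""
    iso = day.isocalendar()
    return (iso[0], iso[1])


# Hypothetical forecast scenarios: (creation date, scenario name).
forecasts = [
    (date(2023, 7, 17), "optimistic"),
    (date(2023, 7, 19), "pessimistic"),
    (date(2023, 7, 24), "optimistic"),
]

# Group scenarios by the week in which they were created.
by_cycle = defaultdict(list)
for created, name in forecasts:
    by_cycle[weekly_cycle(created)].append(name)

for cycle, names in sorted(by_cycle.items()):
    print(cycle, names)
```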
25:24
You can work with Taipy only on the frontend part or only with native scenarios if you want. I actually don't expect you to remember all of those by tomorrow, but the key idea, what you will remember is that once you have native scenarios, you can build features
25:43
on top of it and benefit from it thanks to Taipy. So if using native scenarios to perform what-if analysis is performing what-if analysis, combining all of these features together for this use case feels more like
26:04
unlocking the power of it. So that's it. You can find the live demo, like the demo that I showed you on my computer, you can find it live on Taipy Cloud, which
26:23
is currently in beta, which will have three tiers to deploy Taipy applications. The code that I used is available on GitHub here, and if, like me, you think that Taipy has a role to play in the future of data, well, please add a star on GitHub. Thank
26:43
you. Thank you for your talk. We have around three minutes for some questions if you have. Just please check, use the microphones. I will put mine there. No one? Okay.
27:17
You can find Jean-Baptiste in code space or in Discord, so you can ask him there any
27:25
questions that you have. Thanks again. Great talk.