Data Warehouses and Multi-Dimensional Data Analysis
Formal Metadata
Title: Data Warehouses and Multi-Dimensional Data Analysis
Title of Series: RailsConf 2015
Part Number: 72
Number of Parts: 94
License: CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers: 10.5446/30655 (DOI)
Transcript: English (auto-generated)
00:13
I'm Raimonds Simanovskis and I come from far, far away, from Latvia. Where is Latvia?
00:22
Well, that was the question asked by many Americans three years ago during the London Olympic Games, when the U.S. beach volleyball team was knocked out by a Latvian team. These are the Latvian guys, and then there were many
00:42
questions on Twitter: where is Latvia? Do they even have beaches? I thought they just had vampires and castles and stuff. Therefore, I wanted to start with a short geography lesson. First, according to some movies, vampires live here, but Latvia is located across
01:05
the Atlantic Ocean in northeast Europe, as you see there. Well, according to some other movies, this vampire stuff originated in Transylvania, but that's more to the south of Europe; that's not us. But we have a 500 km long beach, and if you didn't know, that's 310.686
01:29
miles. So we have a lot of beaches; therefore, beware of Latvians when we play beach volleyball. Okay, now back to our topic: data warehouses and multidimensional data analysis.
01:44
Imagine we are building a Rails application which will track product sales to our customers. We have several models in our Rails application, like customers which have many orders; then
02:00
each order is placed on a particular date and can contain several order items; each order item contains the price and quantity of the product that was bought; and the products also belong to product classes. So this is our simple Rails application, and we also went to the last talk and learned
02:23
that we should use PostgreSQL, so we designed and stored everything in Postgres. This is our database schema with customers, orders, order_items, products and product_classes tables, and so, as proud Rails developers, we made our good-looking Rails application.
02:42
Then one day our CEO called us and asked us a question: what were the total sales amounts in California in Q1 last year by product families?
03:00
Okay, we will find it out. Let's look at our database schema: where do we store amounts? We have the order_items table with an amount column, so we should probably start with that one. And as we like Rails conventions and will write everything in Ruby, we start with OrderItem.sum(:amount).
03:22
Now, the next part is "in California". Where do we have this geography? We have it in the customers table; therefore, we need to join the orders table to order_items, join customers, and add the condition that the customer's country is USA and state is California.
03:40
Next, "in Q1 2014": where do we have this time information? It's the order date in the orders table, which we have already joined, so we just need to add a condition. We could translate this condition to the date being between the 1st of January and the 31st of March, but we would like to stick to the original criteria, and therefore we will extract
04:02
the year and the quarter from the date, using Postgres-specific functions for that, and limit it to the first quarter of 2014. And finally, we need to do it by product families, which means that now we also need to join products and product_classes, group by product family
04:22
and get the sum of the amount. So we finally got the answer. It's probably not the shortest query in Rails; this is what we wrote in Ruby and this is the generated SQL, so what we wrote was a little bit shorter, but not much, and we could also have written it directly in SQL.
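(For reference, a minimal sketch of what such an ActiveRecord query could look like, assuming the association and column names described above; the exact names in the demo application may differ, and 'CA' stands in for however the state is stored:)

    # total sales amount in California in Q1 2014, grouped by product family
    OrderItem.
      joins(order: :customer, product: :product_class).
      where(customers: { country: 'USA', state: 'CA' }).
      where("EXTRACT(YEAR FROM orders.order_date) = ? AND EXTRACT(QUARTER FROM orders.order_date) = ?", 2014, 1).
      group("product_classes.product_family").
      sum(:amount)
    # => hash of product family => total sales amount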
04:43
We presented the result to our CEO. Then he asked the next question: but also the sales cost? Well, we could write a separate query, but that won't be as performant. Therefore, we'll modify our query. Unfortunately, with Rails relations we can't make a sum
05:03
of several columns at once. Therefore, we need to write something a bit tricky: explicitly select the product families and the sums of sales amount and sales cost, and then map the results to just the non-empty attributes. Okay, but then our CEO continues to ask questions: and the unique customers count? Okay, we
05:23
can also add a count of distinct customer IDs and return that as well. But we start to worry a little, because these are ad hoc questions, and every 15 minutes our CEO will call us and we will need to write some new query. It would be better if we could somehow teach users to write these queries by themselves.
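(A sketch of that multi-measure query; the cost column on order_items is an assumption:)

    # several aggregates in one query: select raw SQL expressions and read them back by alias
    rows = OrderItem.
      joins(order: :customer, product: :product_class).
      where(customers: { country: 'USA', state: 'CA' }).
      where("EXTRACT(YEAR FROM orders.order_date) = 2014 AND EXTRACT(QUARTER FROM orders.order_date) = 1").
      group("product_classes.product_family").
      select("product_classes.product_family",
             "SUM(order_items.amount) AS sales_amount",
             "SUM(order_items.cost) AS sales_cost",
             "COUNT(DISTINCT customers.id) AS customers_count")

    rows.map { |r| [r.product_family, r.sales_amount, r.sales_cost, r.customers_count] }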
05:43
So we once tried it, explaining how easy it is to write everything in the Rails console and get the result, but unfortunately our business users didn't understand that. Something is not quite right there. In addition, our business is doing pretty
06:03
well, and the number of orders and order items is growing, and we noticed problems when we need to do aggregated queries on large data volumes. For example, we copied a production database to our local computer and got some 6 million rows in the order_items table. And if we didn't add any conditions
06:23
and just wanted to aggregate the sales amount, sales cost and number of unique customers, it took 25 seconds, which is not great for ad hoc queries. Well, then we asked some consultants what to do, and the consultants came and told us that SQL is bad.
06:43
You should use NoSQL or introduce some Hadoop cluster and write MapReduce jobs in JavaScript, which will calculate everything you need. Well, probably not; we still like SQL, so probably we shouldn't do that. But let's return to some classics. Already 20
07:02
years ago, the first edition of The Data Warehouse Toolkit by Ralph Kimball was written; it's now already in its third edition. I would definitely recommend this book to anyone interested in the topic. It talks about dimensional modeling and the main objectives of dimensional modeling.
07:22
Quoting the book: we need to deliver data that is understandable and usable to the business users, as well as deliver fast query performance. And how do we do this dimensional modeling? When doing dimensional modeling, we need to identify the terms
07:44
that we see in these business questions, these analytical questions, and model our data structures based on that. So let's look again at this question: what were the total sales amounts in California in Q1 2014 by product families? The first thing we will always notice are these so-called
08:03
facts or measures. These are numeric measures that we like to aggregate by some other dimensions, and those dimensions we can also identify in these questions. We have California, which is a kind of customer or region dimension; then we see a time dimension; and we see a product
08:24
dimension or product family dimension. So just by talking with our business users, we can identify which facts and which dimensions we need to use. With these dimensional modeling techniques we
08:42
model our so-called data warehouse, where we will store the data organized according to these dimensions and facts that we see in these queries. The typical database schema that is used for this is the so-called star schema, because most often
09:02
we will see one table in the center and then a lot of tables linked to this central table by foreign keys, and therefore it looks like a star. These are the fact and dimension tables. Let's start from the center: this is the fact table. We are using a naming convention with an f_ prefix, so f_sales
09:22
for sales data. The fact table will always contain foreign keys to the dimensions, like customer ID, product ID and time ID, and then the numeric measures we like to analyze, like sales quantity, amount and cost. And then it is linked to the dimension
09:42
tables, for which we use a naming convention with a d_ prefix. We see the customers dimension with all the customer attributes. And then there are some special dimensions like the time dimension: instead of extracting the year or quarter dynamically during
10:03
our queries, we want to pre-calculate them. Therefore, for each date that appears in our sales facts, we will pre-calculate and store a corresponding time dimension record. We will have a time ID as well as the pre-calculated year, quarter and month, both as integers and as
10:23
strings that can be presented to users the way we want, for example as a quarter name or month name, etc.
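(As a rough illustration, such a star schema could be created with a Rails migration along these lines; table and column names follow the naming convention above and are not necessarily those used in the talk:)

    class CreateSalesStarSchema < ActiveRecord::Migration
      def change
        create_table :d_customers do |t|
          t.string :name, :country, :state, :city
        end

        create_table :d_time do |t|
          t.date    :date
          t.integer :year, :quarter, :month
          t.string  :quarter_name, :month_name
        end

        create_table :f_sales do |t|
          # foreign keys to the dimension tables
          t.integer :customer_id, :product_id, :time_id
          # numeric measures we want to aggregate
          t.decimal :sales_quantity, :sales_amount, :sales_cost
        end
      end
    end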
10:44
Sometimes we don't have a simple star schema but a so-called snowflake schema, where some dimensions, like products in our case, are linked further to some classes or categories dimensions, like product classes here. And if we have a lot of these, our database schema starts to look like a unique snowflake. We will store this star schema in a separate database schema,
11:03
or it could even be a separate database if we want that for performance reasons. And how would we manage it from our Rails application? We create corresponding Rails models on top of these fact and dimension tables: we would have Sales,
11:23
CustomerDimension, TimeDimension, ProductDimension, etc. And as this is a separate database schema, we need to regularly populate this data warehouse schema with data from our transactional data. The simplest case would be to just regularly repopulate the whole schema:
11:44
for example, truncate the existing customers dimension table, then select from our transactional schema and insert all the necessary fields into our dimension table. In the case of the time dimension, we need to generate it dynamically: we select all the unique
12:04
order dates that appear, then we pre-calculate which year, quarter and month they belong to, and we store these pre-calculated values in our time dimension table. Finally, we need to load the facts, and in this case we
12:24
select the data from the orders and order_items tables, extract the sales quantity, sales amount and sales cost, and store the corresponding foreign key values to our dimension tables. One thing we can see there: to simplify time dimension ID
12:44
generation, we are using a convention that generates the time ID as four year digits, then two month digits and two day digits, so that we always understand what the time ID refers to.
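(A minimal single-threaded sketch of this repopulation, assuming the Sales, TimeDimension and transactional models named above; a real ETL would rather use bulk INSERT ... SELECT statements:)

    # time dimension: one record per unique order date, id encoded as YYYYMMDD
    Order.distinct.pluck(:order_date).each do |date|
      TimeDimension.create!(
        id:      date.strftime('%Y%m%d').to_i,
        date:    date,
        year:    date.year,
        quarter: (date.month - 1) / 3 + 1,
        month:   date.month
      )
    end

    # facts: one row per order item, with foreign keys pointing at the dimensions
    # (Sales is assumed to be the model mapped to the f_sales table)
    OrderItem.includes(:order).find_each do |item|
      Sales.create!(
        customer_id:    item.order.customer_id,
        product_id:     item.product_id,
        time_id:        item.order.order_date.strftime('%Y%m%d').to_i,
        sales_quantity: item.quantity,
        sales_amount:   item.amount,
        sales_cost:     item.cost
      )
    end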
13:00
Now, if we return to the original question and how we would solve it: all our queries are now more standardized, in that we always start from the sales fact table, then join the necessary dimensions like customers, products, product classes and time, and specify conditions on the dimension tables that we want just USA,
13:21
California, year 2014, quarter one, then group by product families and get the sum. It probably isn't a much shorter syntax, but at least it is more standardized, and we always know how to approach these analytical queries.
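(Roughly, the same question against the star schema, assuming a Sales model with belongs_to associations to the dimensions and a product_family column denormalized onto d_products:)

    Sales.
      joins(:customer, :product, :time).
      where(d_customers: { country: 'USA', state: 'CA' }).
      where(d_time: { year: 2014, quarter: 1 }).
      group("d_products.product_family").
      sum(:sales_amount)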
13:41
But still, we probably wouldn't teach our users to write these queries directly, and we are still limiting ourselves to this two-dimensional table model, where we store everything in standard two-dimensional tables. A much better abstraction for these analytical queries is the multidimensional data
14:01
model. Let's imagine that we have a multidimensional data cube. We can probably picture three dimensions, but imagine data cubes with an arbitrary number of dimensions, where at the intersections of dimension values we store measures
14:21
which correspond to those particular dimension values. In our case we have a sales cube with customer, product and time dimensions, and at the intersection for each particular customer, product and time period we store what the sales quantity, sales amount, sales cost and
14:41
customers count were. Within these technologies, some dimensions might be just a detailed list of values, but other dimensions can have hierarchies with several hierarchy levels. For example, in the customers
15:01
dimension, in addition to the detailed customers level we have all customers together; then we can expand that to individual countries, then countries to states, then states to cities, and cities to individual customers. In the case of the time dimension we could even have several hierarchies: maybe
15:22
sometimes we want reporting by year, quarter, month and individual day, and sometimes we want weekly reporting, where the same dates are grouped together by weeks and then by the years they belong to. And there are special technologies that are better suited for this
15:44
and which use this multidimensional data model. They are typically called OLAP technologies, where OLAP stands for online analytical processing, as opposed to traditional OLTP systems, which are online transaction processing. These technologies
16:04
concentrate more on how to do analytical queries efficiently. There are several commercial technologies for that, as well as open source technologies, and one of the most popular open source OLAP engines is the Mondrian engine
16:25
by Pentaho. It's a Java library where you need to write XML to define the data schemas. Well, we Rubyists don't like Java and XML so much; therefore, a couple of years ago I created the mondrian-olap
16:44
gem, a JRuby gem which embeds the Mondrian OLAP Java engine and creates a nice Ruby DSL around it so that you can use it from plain Ruby. So let's introduce this Mondrian OLAP into our application.
17:06
The first thing that we need to define is the Mondrian schema, where we do the mapping of the dimensions and measures that our users will use and which represent these business terms, mapping them to the fact and dimension
17:24
tables and columns where the data are stored. Let's look at an example: we define the Sales cube, which will use the f_sales fact table; then we define our dimensions, for example the customer dimension with its foreign key, which will be
17:44
using the d_customers table in the data warehouse schema, and we specify all the levels that we want to use in this dimension and in which particular columns they are stored. We define the product dimension and time dimension as well,
18:04
and then finally we also describe the measures that we will use in our schema, like sales quantity, sales amount and sales cost, which use sum as the aggregator; but then we have the customers count measure, which will do a distinct count on the customer ID
18:24
in the sales fact table to get the unique count of customers for a particular query, and there we use a different aggregator.
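(A sketch of such a schema definition with the mondrian-olap Ruby DSL, loosely based on the gem's README; option names and level details are from memory and may differ from the schema shown in the talk:)

    schema = Mondrian::OLAP::Schema.define do
      cube 'Sales' do
        table 'f_sales'

        dimension 'Customers', foreign_key: 'customer_id' do
          hierarchy has_all: true, all_member_name: 'All Customers', primary_key: 'id' do
            table 'd_customers'
            level 'Country', column: 'country'
            level 'State',   column: 'state'
            level 'City',    column: 'city'
            level 'Name',    column: 'name'
          end
        end

        dimension 'Time', foreign_key: 'time_id', type: 'TimeDimension' do
          hierarchy has_all: true, primary_key: 'id' do
            table 'd_time'
            level 'Year',    column: 'year',         type: 'Numeric', level_type: 'TimeYears'
            level 'Quarter', column: 'quarter_name', level_type: 'TimeQuarters'
            level 'Month',   column: 'month',        type: 'Numeric', level_type: 'TimeMonths'
          end
        end

        # a 'Products' dimension with a 'Product Family' level would be defined similarly

        measure 'Sales Quantity',  column: 'sales_quantity', aggregator: 'sum'
        measure 'Sales Amount',    column: 'sales_amount',   aggregator: 'sum'
        measure 'Sales Cost',      column: 'sales_cost',     aggregator: 'sum'
        measure 'Customers Count', column: 'customer_id',    aggregator: 'distinct-count'
      end
    end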
18:43
Now, when we look at the same question and how we could get the results using Mondrian OLAP, it's very simple and nice: the query almost directly translates the question. We say that from the Sales cube, on columns (as column headings) we want to put the sales amount; on rows we want to put all product families, taking
19:04
all members from the product family level; and we put a filter that from the customers dimension we want just USA, California, and from the time dimension we take quarter Q1 2014. And we get the result.
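(A sketch of this query with the mondrian-olap query builder; connection parameters and exact member paths are assumptions that depend on the schema definition:)

    olap = Mondrian::OLAP::Connection.create(
      driver:   'postgresql',
      host:     'localhost',
      database: 'sales_dwh',   # hypothetical data warehouse database
      username: 'dwh_user',
      password: 'secret',
      schema:   schema
    )

    result = olap.from('Sales').
      columns('[Measures].[Sales Amount]').
      rows('[Products].[Product Family].Members').
      where('[Customers].[USA].[CA]', '[Time].[2014].[Q1]').
      execute

    # roughly corresponds to MDX like:
    #   SELECT [Measures].[Sales Amount] ON COLUMNS,
    #          [Products].[Product Family].Members ON ROWS
    #   FROM [Sales]
    #   WHERE ([Customers].[USA].[CA], [Time].[2014].[Q1])

    result.row_names   # product family names
    result.values      # cell values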
19:25
We don't have any technical implementation details in the query; they are hidden and defined once in the Mondrian schema definition. The Mondrian engine, like several others, internally uses the MDX query language, which is one of the most popular query languages for these OLAP tools and looks a little bit similar to
19:44
SQL, but not quite. The mondrian-olap JRuby gem does the translation from this query builder syntax to the MDX query language, which is then executed, and as a result we get a result object which we can query to get our
20:04
column headings, our row headings, and the cell values we are getting there. Another benefit of this Mondrian engine is that when we execute some large
20:22
MDX query where we do not do any filtering (again, I tested it on six million rows in the fact table), the initial large query will take some twenty-one seconds, but when we execute the same query a second time it executes in ten milliseconds, because the Mondrian
20:41
engine caches the results in this multidimensional data cube model. It doesn't cache the queries themselves; it caches the actual results, so that when we run a new query it analyzes: okay, we already have these data cached in these data cube
21:02
cells, we don't have these other ones, and for those it generates the corresponding SQL statements to populate the data. And as in these analytical solutions we don't need information that is up to date to the latest second, we typically just regularly populate our data warehouse schema with the data, and then, once it's populated,
21:22
it can cache all the results, and if many users are asking the same thing, the results will be very fast. An additional benefit is that now we can much more easily introduce additional dimensions based on additional data attributes. For example, in the customers table we had a gender column which stored F or
21:42
M as values for female or male, and we want to add an additional gender dimension to our schema. We can easily create a new gender dimension mapped to the gender column of the customers table, and in addition, for users, we want to decode that
22:02
F means female and M means male, so we can put in a name expression which will be used for generating the names of these dimension members, and then we can use this dimension in the same way as any other in our queries.
22:23
In addition, we can do even more advanced dynamic attribute dimensions. For example, we have a birth date for our customers and we would like to analyze sales by customer age and split them into several intervals, for example less than 20 years, 20 to 30
22:42
years, 30 to 40, etc. But we only have a birth date, so we need to calculate the age dynamically. For this we can define a new age interval dimension where we specify, once, a more complex SQL expression which will dynamically
23:03
calculate the difference between the birth date and the current date, and based on the interval it will output "less than 20 years", etc. We dynamically generate the new dimension with these values, and whenever we run queries it will always be up to date relative to the current date.
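(Illustrative only: the mondrian-olap DSL mirrors Mondrian's XML schema elements, so a decoded gender level and a dynamically calculated age interval level could look roughly like this, assuming Postgres and a birth_date column; the exact nesting and method names may differ:)

    dimension 'Gender', foreign_key: 'customer_id' do
      hierarchy has_all: true, primary_key: 'id' do
        table 'd_customers'
        level 'Gender', column: 'gender' do
          # decode the stored F / M values into user-friendly member names
          name_expression do
            sql "CASE gender WHEN 'F' THEN 'Female' ELSE 'Male' END"
          end
        end
      end
    end

    dimension 'Age interval', foreign_key: 'customer_id' do
      hierarchy has_all: true, primary_key: 'id' do
        table 'd_customers'
        level 'Age interval' do
          # derive the interval from birth_date dynamically at query time
          key_expression do
            sql "CASE WHEN AGE(birth_date) < INTERVAL '20 years' THEN 'Less than 20' " \
                "WHEN AGE(birth_date) < INTERVAL '30 years' THEN '20 - 30' " \
                "WHEN AGE(birth_date) < INTERVAL '40 years' THEN '30 - 40' " \
                "ELSE '40 and more' END"
          end
        end
      end
    end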
23:22
Finally, one more benefit of this Mondrian engine is that we can also make calculation formulas,
23:40
like calculated measures, for example profit, which is sales amount minus sales cost, or margin percentage, which is profit divided by sales amount, where we can specify a format string so that it uses percentage formatting. As a result, we can query these calculated measures in the same way as stored measures and get the results back, properly formatted.
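(A sketch of such calculated members inside the cube definition; the formula and format strings follow standard MDX syntax, while the DSL method names are assumptions based on the gem's documentation:)

    calculated_member 'Profit' do
      dimension 'Measures'
      formula '[Measures].[Sales Amount] - [Measures].[Sales Cost]'
    end

    calculated_member 'Margin %' do
      dimension 'Measures'
      formula '[Measures].[Profit] / [Measures].[Sales Amount]'
      format_string '#,##0.00%'
    end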
24:04
In these MDX calculation formulas, almost everything you can do in Excel has a corresponding function in MDX as well, so you can do a lot of more advanced calculations there. As a result, this data model allows
24:23
us to also create better user interfaces for ad hoc queries by users, as we don't want them to always have to write these queries by themselves, and the objects we are using are the same terms in which users ask their questions. So
24:44
this is just an example from the eazyBI business intelligence application that we are building, where we provide a graphical user interface in which users can say: okay, we want this dimension on columns, this dimension on rows, filter by these dimensions, and then view the results in a table or in charts and format them. So this data model is much
25:04
better for doing these ad hoc queries. Okay, let's switch to a couple of other topics. We discussed how to do the queries, but let's come back to the ETL process. We talked about the three-letter acronyms SQL and MDX, so let's talk about another
25:24
three-letter acronym, ETL, which means extract, transform, load. In the simplest case, as we saw, we can populate our data warehouse just from the operational, transactional tables in our database, but quite often we need many different data sources for our data warehouse: some are
25:44
stored in our transactional databases, some are coming from external sources as CSV files or from REST APIs. This process is how we extract this information from the other sources; then we need to transform it, probably parse different data formats, maybe
26:04
unify and standardize the data to use the same primary and foreign keys, etc., which is the transformation step; and finally we populate and load the data into our data warehouse. There are several Ruby tools for doing this ETL: one ETL gem was done
26:25
by Square, and I want to mention one new gem, Kiba, for doing this ETL process, which is oriented to row-based extraction, transformation and loading. This is an example from the README:
26:45
you can make some reusable methods that do some data parsing, then you define a source as a Ruby class (the source could be a CSV file or a database or something like that), then you can chain several transformations and describe in this DSL how you would like to do them, and finally you load the data into a destination such as a database.
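(A minimal Kiba script in the spirit of the README; CsvSource and DatabaseDestination are hypothetical classes you would define yourself, as Kiba itself only provides the source / transform / destination DSL:)

    # sales_import.etl - run with: kiba sales_import.etl
    require 'date'
    require_relative 'csv_source'            # hypothetical source class
    require_relative 'database_destination'  # hypothetical destination class

    # reusable parsing helper
    def parse_date(value)
      Date.parse(value)
    end

    source CsvSource, 'orders.csv', col_sep: ';'

    transform do |row|
      row[:order_date] = parse_date(row[:order_date])
      row
    end

    destination DatabaseDestination, table: 'f_sales'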
27:05
One more thing I wanted to mention: if you do complex transformations, then unfortunately Ruby is not the fastest programming language, and if you need to process thousands, hundreds of thousands, or
27:24
millions of rows, it might be slow. But if we still want to stick with Ruby, maybe we should do it in parallel, and therefore I recommend taking a look, for example, at the concurrent-ruby gem, which provides several abstractions. One which is very well suited for this multi-threaded ETL
27:44
is the thread pool. With a thread pool we can create a pool of fixed or varying size, push jobs to it, and when a job completes it gives some result, which might then be processed by the next thread pool. This might
28:04
suit this ETL process very well: we can have a data extraction thread pool, for example if you fetch some data from external REST APIs. If it's a paginated REST API, it is much faster to start fetching all the pages in parallel and not fetch pages one by
28:24
one, first one page, then the next one, etc. It will be much faster in terms of total clock time to fetch, say, the first ten pages in parallel, then the next ten pages, so we can use a thread pool there.
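(A small sketch of that idea with concurrent-ruby; fetch_page and the page count are hypothetical:)

    require 'concurrent'

    pool = Concurrent::FixedThreadPool.new(10)

    # fetch_page(n) is assumed to call the paginated REST API and return parsed rows
    futures = (1..100).map do |n|
      Concurrent::Future.execute(executor: pool) { fetch_page(n) }
    end

    rows = futures.flat_map(&:value)   # blocks until each page has been fetched

    pool.shutdown
    pool.wait_for_termination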
28:45
Then, if we need to do complex transformations of the data, we can run these transformations in parallel threads as well. But there is one pro tip: please use JRuby, as JRuby can use all your processor cores. If you try to do it in MRI, then unfortunately only one thread can run at a time, or you will need to make several processes which run in parallel.
29:06
Let's look at one very simple example. We initially looked at a single-threaded ETL process where we selected unique dates from orders and then inserted them into our time dimension table; let's make it multi-threaded. In this example,
29:24
we initially create an insert-dates pool as a fixed thread pool with a default size of four, then we select all the unique dates, but now we post them to this thread pool, and in this thread pool we do the data insertion. But please note that in this case, if you're using multiple threads,
29:44
please always explicitly check out and check back in connections from the ActiveRecord connection pool. Otherwise, each new thread will automatically check out a new database connection, and if you don't give it back, you will run out of database connections. And finally, we shut down the pool and wait for termination.
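(A minimal sketch of this multi-threaded time dimension load, assuming the Order and TimeDimension models from before:)

    require 'concurrent'

    insert_dates_pool = Concurrent::FixedThreadPool.new(4)

    Order.distinct.pluck(:order_date).each do |date|
      insert_dates_pool.post do
        # explicitly check a connection out of the pool and give it back afterwards
        ActiveRecord::Base.connection_pool.with_connection do
          TimeDimension.create!(
            id:      date.strftime('%Y%m%d').to_i,
            date:    date,
            year:    date.year,
            quarter: (date.month - 1) / 3 + 1,
            month:   date.month
          )
        end
      end
    end

    insert_dates_pool.shutdown
    insert_dates_pool.wait_for_termination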
30:04
In this simple case I also did some benchmarking locally and managed to cut the total clock time for loading the data in half. But please be aware that if you start to increase the thread pool size even more, you might start to get worse
30:24
results, because in this case we are still ultimately inserting all the data into the same Postgres table, and Postgres might start to do some locking and slow down the process if we try to do insertions into the same table from too many parallel threads. So please do benchmark; if you use JRuby, there are good standard Java tools for that, like
30:44
VisualVM or Java Mission Control. And regarding JRuby, you don't need to rewrite your whole application in JRuby; you can use it just for your data warehouse project, where you populate the data and then do the queries. And finally,
31:01
I wanted to give a short overview of traditional versus analytical relational databases. Most of us, when working with SQL databases, think of the traditional databases which are optimized for
31:21
transaction processing, like MySQL, Postgres, Microsoft SQL Server or Oracle. They can deal with large tables, but they are optimized for doing small transactions, like inserting, updating and selecting small sets of results. But as we saw,
31:41
if we try to do aggregations over millions of records, they are not the best technology for that. There is a different set of SQL relational databases which are optimized for analytical processing. For example, there is
32:01
one of the pioneers, the open source database MonetDB; there are several commercial databases like HP Vertica or Infobright, which also have community editions where you can use them up to some significant data size or with some limited features;
32:21
and if you are using Amazon Web Services, Amazon provides the Amazon Redshift database, which is also a SQL database but optimized for analytical queries. And what is the main magical trick
32:41
that these databases use for these analytical queries? They mostly use a different data storage layout. If we look at the traditional databases, they mostly use row-based storage, which means that if we have a table
33:01
and a row in the table, then physically in the database all the columns of that row are stored together in the same file blocks. When we need, for example, to do a sum of some numeric amount, as we saw with the sum of sales amount,
33:21
then it will need to read practically the whole database table, because we need to pick the sales amount from here, from here, from here, and therefore that's slow. What most of these analytical databases do is use columnar storage: from a logical perspective we are still
33:41
using them as tables with rows, but the physical storage is organized by columns. In the same example, all values of one column are stored together, and the next column's values are stored together.
34:02
What's the main benefit? Now, if we need to do a sum or count of some column over all records, the values are all stored together and we can read them much quicker. The other benefit is that, especially in these data warehouse
34:21
sales fact tables, we also have a lot of repeating values, for example the foreign keys, or classifying information stored directly as repeating values, and if they are all stored together they can be compressed much more effectively.
34:41
Therefore, these analytical databases also do better compression of the inserted data. But the major drawback is that individual transactions with columnar storage are much slower: if you now want to insert rows one by one into these
35:02
analytical databases, it will be much slower than in traditional transactional databases, and the same if you update one by one. Therefore, if you are using these columnar-storage analytical databases, you typically prepare the data you would like to have there and then do a bulk import of the
35:22
whole table, or a bulk import of just the changes, which will be much more efficient. I also made a simple example on my local machine: as I said, I had generated the sales fact table with six million rows, and I ran
35:42
a query which just does the aggregation of sales amount, sales cost and the distinct count of customer IDs over those six million rows, grouped by product families. On Postgres, whenever I ran it, it took approximately 18 seconds on my local machine.
36:02
Then I installed HP Vertica in a virtual machine and didn't do any specific optimization or configuration there. The first query I ran took about nine seconds, because it just needed to load and cache the data in memory, but
36:22
each repeated query took just 1.5 seconds. So with exactly the same amount of data I got 10 times faster performance. In reality you probably won't get 10 times better performance all the time, but in some studies of real customer data they
36:42
quite often report some three to five times improvement in query speed for these aggregation and group-by queries. I did the testing also on Amazon Redshift and got similar results with the same data set. And
37:02
my very unsophisticated, well, unscientific recommendation on what to consider: if you have less than a million rows in your fact tables, then you probably won't see any big difference. If you get to 10 million, then complex queries will
37:22
get slower on Postgres or MySQL, and if it's 100 million, you won't be able to manage these aggregation queries in a realistic time. So when you already have 10 million and more records in your fact table, then for analytical queries you might need to consider
37:41
these analytical columnar databases. So, a short recap of what we covered: problems with analytical queries using traditional approaches, dimensional modeling and star schemas, Mondrian OLAP and MDX, ETL, and the analytical columnar databases. Thank you very much for
38:01
your attention. You can see all these examples posted on my profile; there is a sales app demo application, so you can find there what I showed, and later all my slides will be published as well. Thank you very much, and I still have some two minutes for questions.
38:21
thank you