Estimating the Performance of Predictive Models in R
Formal Metadata
Title 
Estimating the Performance of Predictive Models in R

Title of Series  
Author 

Contributors 

License 
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. 
Identifiers 

Publisher 
Vanschoren, Joaquin

Release Date 
2014

Language 
English

Content Metadata
Subject Area  
Abstract 
This talk will start with a very brief introduction to R and the main concepts of this data analysis environment and programming language. We will then shift focus to predictive tasks and models obtained from data to solve these tasks. Finally, the main topic of the talk will be on how to solve the critical issue of estimating the predictive performance of alternative models to solve some task. This estimation process is key to answer the question of which model is the "best" for a problem we are facing. We will describe the facilities provided by the R package performanceEstimation to address this model selection problem and provide some illustrative case studies. We wrap up with the ongoing plans of interfacing this package to OpenML.

Keywords 
R
Statistical software
Machine Learning
OpenML

00:01
[inaudible] Thank you. As was just mentioned, my talk is essentially about a relatively recent package that I developed for the issue of estimating the predictive performance of models in R, which of course is related to the goals of OpenML.
00:25
I was also asked to give a short introduction to R, on the assumption that people here are not familiar with it, and I was stupid enough to accept the challenge, which is basically impossible.
00:58
So I tried to deliver it in a way that is not really an introduction but a kind of show-off of some things that you can do in R, while at the same time giving some illustration of using R.
01:13
For those who are familiar with R, I am sorry about this boring stuff that you will hear again, but I will do my best in this 20-minute introduction. R is a tool which is a programming language and an environment for data analysis, and it is currently one of the most used tools for data mining, according to a recent survey.
01:45
It has the nice feature of being free and open source, which of course also relates to the goals of OpenML, such as reproducibility.
01:57
It is used not only in academia but also in industry. It is my favourite tool, of course, so I only have good things to say about it. It is available for most platforms, and the base installation already comes with an impressive set of functionality, which you can extend through a system of packages.
02:26
There are now more than 5,000 packages available, so R is really very broad in terms of the application areas in which you can use it.
02:41
One of these packages is the OpenML package [unclear], an interface to OpenML. Packages provide extra functionality, new functions and eventually data, that people can use for some more specific purpose. Using one involves first installing and then loading the package. So that is the basic introduction.
03:08
The first cultural shock that most people face with R is the usual command line, which people nowadays get kind of upset about because they are used to point-and-click buttons and menus. The first thing that shocks people when they see R is: where are the menus? Basically, there are no menus to help you, which means that you need to know what you are doing, which is a good thing.
03:41
Essentially, you type commands. It is a kind of interactive interface where you type a command and then you get a result in return.
03:55
Of course, you can also save the commands in a script if you want and run them sequentially.
04:00
Then there is the notion of a variable, which can store anything from simple things like numbers to complex models. You can store different types of information, like numbers, strings and so on, and there are different types of objects that you can keep in variables. Another important notion is the function, because everything in R is basically a function. R has lots of built-in functions, for instance for creating a set of numbers or applying some function to a set of numbers.
04:43
There is also the notion of vectorization, where you apply a function to a set of things and get back the set of results. These are the basic notions of the R language, and you get used to them easily.
04:59
You can also extend the language by creating your own functions, and a function is another thing that you can assign to a variable: you essentially assign the content of the function to a variable, and after that you can reuse it later on. That is the sort of thing that one does when creating packages: you are creating new functions, for instance a function to upload your [unclear] to OpenML.
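The notions described above (variables, built-in functions, vectorization and user-defined functions) can be sketched in a few lines of base R. This is only an illustrative sketch: the values are made up, and `standardise` is a hypothetical helper name that does not come from the talk.

```r
# A variable can hold a single number, a vector of numbers, or text.
x <- 42
v <- c(4, 8, 15, 16, 23)      # c() builds a vector
label <- "some text"

# Built-in functions operate on whole vectors.
evens <- seq(2, 10, by = 2)   # the even numbers from 2 to 10
avg   <- mean(v)

# Vectorization: sqrt() is applied to every element at once.
roots <- sqrt(c(1, 4, 9))

# A user-defined function is just another value assigned to a variable.
# `standardise` is a hypothetical helper name used only for this sketch.
standardise <- function(values) {
  (values - mean(values)) / sd(values)
}
z <- standardise(v)
```

Once defined, `standardise` can be reused on any numeric vector, which is the same mechanism packages use to add new functions to the language.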
05:21
That covers the basics. Data frames are the central type of object in R for data mining, which is essentially what we are talking about in OpenML.
05:38
Data frames store data tables. From now on I will go from these data frames straight to illustrations of the different steps of a data mining project, of course very briefly. The first thing that you need to do is to load the data that you want to use into R.
06:09
You can easily do that in many different ways. You may have a text file, and there are functions that read it into a data frame. You may have your data in an Excel spreadsheet, and there are also functions that read data from a spreadsheet into a data frame, which, as I mentioned, is the central thing in R.
06:34
You may even have your data in your favourite database management system; R has a lot of packages to interface with these.
06:47
With MySQL, for instance, you just write your SQL queries and get your data pulled from the database into a data frame. That is the sort of thing that you can do in order to get your data into these objects, data frames, which are used by most R packages. The next step, once you get your data into R, is the standard exploratory analysis of your data, which typically involves querying and summarising some properties of the data.
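As a minimal sketch of loading a text file into a data frame, the example below writes a small CSV to a temporary file and reads it back with base R's `read.csv()`. The file contents and column names are invented for illustration; the talk does not say which reading function was shown.

```r
# Write a tiny CSV to a temporary file so the example is self-contained.
# The column names and values are made up for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,height,group",
             "1,1.72,a",
             "2,1.65,b",
             "3,1.80,a"), csv_path)

# read.csv() parses the file into a data frame: one row per line,
# one typed column per field.
people <- read.csv(csv_path)

str(people)              # column names and types
nrow(people)             # number of rows
summary(people$height)   # quick numeric summary of one column
```

Spreadsheets and databases follow the same pattern through add-on packages: a reading function returns a data frame that the rest of the analysis works on.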
07:26
That can be done in many different ways in R, but one very nice way, in my opinion, is the dplyr package, which is devoted to querying and summarising data that may be stored in data frames but also in a database management system.
07:44
The package has several interesting features, like implementing the most basic data manipulation operations and being able to handle different data sources: your data may already be in a database, and you can still work with dplyr on data that is stored in the database [unclear]. It is very good at isolating the data source from the data manipulation operations.
08:21
It essentially has a very basic set of functions that in a way emulate what you would do in a database management system: functions for filtering rows, selecting columns, reordering rows, adding new columns, summarising your data, and creating groups of rows.
08:51
This is what you usually do in database management systems, and you get it in R through dplyr. Here are a few examples. In the first step you essentially tell dplyr where your data is; in this case it is in a standard data frame, but it could be in a database management system.
09:14
Once the source is identified, you can apply these functions. For instance, you can filter your data by a set of logical conditions; in a way you are doing the same things that you do in SQL, but you are doing them directly in R. Through a chaining operator you can apply a sequence of data manipulation operations, like filtering, then selecting some columns, and then arranging in descending order of something. So you do the sort of querying that you are accustomed to doing in SQL, but directly in R.
09:57
You can also create new columns in your data and summarise them, and you can create subgroups and then apply summaries to these groups. All these data summarisation and data manipulation operations are very easily done through the very small set of verbs that this package supplies to you.
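The verbs listed above can be chained as in the following sketch, which assumes the dplyr package is installed. The built-in `iris` data set, the filter conditions and the derived `ratio` column are choices made for this illustration, not necessarily what was on the slides.

```r
library(dplyr)

# iris ships with R: 150 flowers, 4 measurements, 1 Species column.
result <- iris %>%
  filter(Sepal.Length > 5, Petal.Width > 0.2) %>%   # keep rows matching both conditions
  select(Species, Sepal.Length, Petal.Length) %>%   # keep three columns
  mutate(ratio = Petal.Length / Sepal.Length) %>%   # add a derived column
  arrange(desc(ratio))                              # sort, largest ratio first

# Grouped summary: one row per species.
by_species <- iris %>%
  group_by(Species) %>%
  summarise(n = n(), mean_petal = mean(Petal.Length))
```

Each verb takes a data frame and returns a data frame, which is what makes the `%>%` chaining read like a step-by-step query.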
10:25
The dplyr package is also computationally very efficient, so if you are into exploring data with R, I strongly recommend that you look at it. Still within the exploratory analysis of your data, it is usually also a good idea to get some visualizations. For that, ggplot2 is probably one of the best packages for data visualization in R. Not by coincidence, it is developed by the same developer. It is a very good package for creating nice statistical visualizations in R.
11:12
It has some very nice theory behind it, and you may want to browse through these references to get the idea.
11:31
Essentially, it maps the properties of your data, which are the variables that describe your data, into properties of the graph.
11:41
As a first stage, you make this mapping, an aesthetic mapping in ggplot2 terms. You are saying: this property of my data is going to be mapped to this property of the graph, which is the x axis; this other property of my data is going to be mapped to the y axis; and another property is going to be mapped to colour, which is another property of the graph. Then you overlay some geometrical object; in this case I am plotting points using this mapping. That is the general concept of ggplot2.
12:31
Of course, you can do much more complex things; this is just a brief overview.
12:37
You can even build on top of that. For instance, if you have spatiotemporal data, you can use [unclear], which builds on ggplot2.
12:53
With it you can build fancy spatiotemporal representations. That is, again, my two-minute tour of this topic.
13:01
These are just some pointers to packages that are, in my personal opinion, interesting for doing this kind of exploratory analysis.
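The aesthetic mapping described above (data variables mapped to x, y and colour, with points overlaid) can be sketched as follows, assuming the ggplot2 package is installed. The choice of the `iris` data and of these particular columns is an assumption made for illustration.

```r
library(ggplot2)

# aes() declares the mapping from data variables to graph properties:
# x position, y position and colour.
p <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width, colour = Species)) +
  geom_point()   # the geometrical object drawn with that mapping

# print(p) renders the plot on the active graphics device.
```

Swapping `geom_point()` for another geometrical object changes how the same mapping is drawn, without touching the data or the `aes()` call.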
13:13
Then there is modelling. Almost all existing modelling techniques are available in R. This is just a very brief overview, but probably the most interesting aspect for you is that the interface to these different techniques is essentially very similar.
13:32
Most of the time this holds independently of the technique that you are applying, like an SVM, [unclear] or gradient boosting machines.
13:34
Typically, the first argument is what is known in R as a formula.
13:47
A formula is essentially a language for specifying the functional form of the prediction: you tell R what the target variable is and what the predictors are. In particular, if the predictors are all the remaining variables, you put a dot, meaning that everything else is to be taken as a predictor. This is very general across almost all of the modelling functions available. The second parameter is typically the data that you are going to use for obtaining your model, and then some techniques take other parameters for you to tune your model. That is the general picture. You have specific functions for obtaining the different predictive models; some of them come in different packages that implement these models, so you load these packages and then call the functions to obtain the models, assigning the result to a variable that will store your model. So that is the general way of obtaining a predictive model in R: you have to find out which function implements your favourite technique, eventually find the package that implements that function, and know how to load and use it. Typically the first argument will be a formula, the second a data frame, and then any parameters accepted by the technique. You can also do unsupervised learning, for instance clustering. Again, a large number of clustering methods are available, for instance k-means, fuzzy clustering and hierarchical clustering, to name just a few examples. Most of these things are already implemented in R.
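The formula-plus-data-frame pattern described above can be sketched with base R's `lm()`. Linear regression is used here only because it needs no extra package; it stands in for whichever modelling function is of interest, and the `iris` data and chosen target are assumptions for illustration.

```r
# Formula interface: target on the left of "~", predictors on the right.
# "." stands for "all remaining columns of the data frame".
# lm() (base R linear regression) is used here only because it needs no
# extra package; it is not necessarily a model shown in the talk.
model <- lm(Sepal.Length ~ ., data = iris)

# The fitted model is stored in a variable and reused for prediction.
preds <- predict(model, newdata = iris[1:5, ])

# Same pattern with an explicit choice of predictors.
small <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)
```

Many modelling functions from add-on packages accept the same first two arguments, so switching technique often means changing only the function name and its tuning parameters.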
15:44
You can play with them if you want, and again these are just some pointers to packages that implement the common algorithms in R. Another key aspect of any data mining project is reporting: at the end of the day, you need to communicate your findings to other people.
16:13
This typically involves several different tools, which is a problem. Most people use their favourite data mining tool, like R or something else, then go to their favourite word processor or presentation software, and then go into a kind of mad copy-and-paste between the two tools, with a lot of manual and tedious work, trying to produce some text around the analysis to communicate to people.
16:52
That is not only tedious but also prone to errors. This is where dynamic documents come in, which I would like to illustrate for you. The idea does not exist only in R, and it goes back quite far in time.
17:10
In R it is implemented through the knitr package, which allows you to create dynamic documents that mix your analysis with your comments on it. In a single document you write your story and you show your results. Instead of having two separate tools, your analysis tool and your word processor, and going manually through this copy-and-paste, you do everything in a single document. So that is the idea of dynamic documents: a single document mixes your story with the code that implements your analysis, you pass it to R and knitr, and you get your final report, which you show to your manager or whoever [unclear].
18:06
One of the big advantages is that if the boss does not like your report and you have to change anything, you typically just change a single thing and produce the report again. To give you a small illustration of what I am talking about, here is one example of a dynamic document. You see that I have text without any particular formatting, apart from some kind of tags for putting something in bold [unclear], and then I have these grey areas, which are known as code chunks. Those are the things for which people would usually go to their data mining tool and then paste the result into the document, but you can actually do everything in a single document. Then you give this document to R; suppose you want an HTML file as output, so you just press this [unclear].
19:14
R goes on, compiles the document and produces the report. That is what a dynamic document looks like. Of course, you can show only the results of the code, or you can show both the code and the results. So you can produce your report dynamically like that.
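The idea of a document that mixes text with code chunks can be sketched with `knitr::knit()`, assuming the knitr package is installed. The document text below is invented for this illustration, and the chunk fences are built with `strrep()` only so that the example does not contain literal Markdown fences.

```r
library(knitr)

# Build the chunk fence programmatically so this example contains no
# literal Markdown fences of its own.
fence <- strrep("`", 3)

# A tiny dynamic document: prose plus one R code chunk.
doc <- c(
  "The average petal length in the iris data is computed below.",
  "",
  paste0(fence, "{r}"),
  "mean(iris$Petal.Length)",
  fence
)

# knit() runs the chunk and weaves its output into the text,
# returning the resulting Markdown as a single string.
report <- knit(text = doc, quiet = TRUE)
cat(report)
```

In everyday use the document lives in a file rather than a character vector, and a converter turns the knitted Markdown into HTML or another output format.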
19:31
If HTML is not what you want, you can choose another output format.
19:35
If you prefer [unclear], you just change the output setting.
19:42
You then get a PDF, and if you want you can also get a Word document, and you can also produce slides like these.
20:06
So the reporting part can be fully integrated with the tool, which is a way of avoiding many mistakes. By the way, for research projects, if you have someone on a scholarship [unclear], some person who will leave sooner or later, it is a good idea to ask them to do their reports like that, because in the end you have not only the results but also the way the results were obtained. But sometimes we need not only to report but also to deploy our data mining results.
20:43
or data money results to we clients or and that also and Nice tools of is a not 40 point is that the mine results which would be enough the end of the change you want
20:56
and how 1 of them is that they could which allows you to very
21:01
create went out directly from from the and so you can create very easily graphical use and the fact that runs on or more brawls that you really don't need to take care about to go up a self to a cross from compatibilities the issues like that because you have
21:20
a standard brawls as the only require and so essentially shiny went out a strong by 2 files 1 that takes care of the users to face and the other 1 that takes care of the of the competition behind to produce the results on the way out and that actually very simple so if you want that the very simple example the fire and a
21:46
you get something like that, which runs in a browser, and it is interactive in the sense that (of course it is not very fancy) if you change something here, the plot changes, as you can see. So you can use these Shiny web apps to create something dynamic which in the background operates using your R code, and then you can create your fancy app. This is not actually a very good example, but you can create
22:16
this sort of web app, and then you can host it on your own
22:25
server, or Shiny also has some hosting facilities, and that is very easy. Essentially there are two files: the user interface file, which has some commands controlling the layout of the interface, and then the server file, which actually does the computation that produces the output. And there are
22:51
also some functions providing the
22:54
communication between the two files. But essentially it is very easy to produce something that is runnable or showable to someone with a very short amount of code, because Shiny provides a lot of functions to create user interface widgets and stuff like that. So that was a short introduction
23:21
to R. There are many more data mining steps that you can carry out, but there is not much time to talk about them. Then there is the issue of estimating and comparing the
23:36
performance of predictive models, which is of course related to the goals of OpenML, and I just wanted to talk a bit about the package that I developed for doing these things. So, just for
23:54
us to be on common ground: essentially we have predictive tasks, in which there is an unknown function that maps the values of a set of predictors into a target variable. Typically this is either a nominal or a numeric variable, that is, a classification or a regression problem. Then we are given the training data, and typically we decide on some performance evaluation criteria, and the goal of this sort of experiment is to obtain a reliable estimate of the value of these performance criteria on the basis of these data
24:38
using some predictive tool. So that is the general goal of performance estimation in the context of predictive tasks. Now, one way,
24:48
or rather one wrong way of doing that, which people sometimes use, is to use these resubstitution estimates, where you are given a data set, you obtain your favourite model, and then you apply it to the same data and collect the performance metrics that you get using this model. That of course is kind of unreliable, because these estimates tend to be
25:15
overoptimistic, because if your model is very good at approximating the data that you gave it, then of course the result will be very good. That depends a lot on the type of technique that you are using, and so the seriousness of this unreliability depends on the technique, but generally it is a bad idea to proceed this way. So the main goal of performance estimation is this issue of reliably estimating the expected prediction error on new data, using only the data that we have. For that to be possible, we typically want to test on a
25:59
separate set of test cases, which is usually called the test set, instead of applying the model directly to the data that was used to obtain it. There are of course many ways of doing this, and moreover we typically tend to repeat this training and testing several times to make our estimates more reliable, and then we typically average the scores over these repetitions, together with some standard error. So that is the game that we play here in performance estimation. Now there are many methods, or procedures if you want, to obtain these reliable estimates. One of the simplest is this idea of hold-out, where you randomly split the available data into two sets, one for training the model and the other one for testing.
27:01
That of course produces a single score, if you want, and so what we frequently do is repeat this random splitting several times and then collect statistics on the scores that we obtain. A slightly different approach is the idea of k-fold cross-validation.
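The contrast between a resubstitution estimate and a repeated hold-out estimate can be sketched in a few lines. This is an illustration only, written in Python with the standard library rather than with the R package from the talk; the synthetic data, the 1-nearest-neighbour model and all function names are assumptions made for the example. A 1-nearest-neighbour model is chosen because it reproduces its training data exactly, which makes the optimism of resubstitution easy to see.

```python
import random
import statistics


def make_data(n, rng):
    """Synthetic data (an assumption for this sketch): y = 2x + Gaussian noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0.0, 1.0)) for x in xs]


def predict_1nn(train, x):
    """Predict with the target value of the nearest training point."""
    return min(train, key=lambda point: abs(point[0] - x))[1]


def mse(train, test):
    """Mean squared error of a 1-NN model built on `train`, scored on `test`."""
    return statistics.fmean([(predict_1nn(train, x) - y) ** 2 for x, y in test])


def repeated_holdout(data, n_reps, test_fraction, rng):
    """Return one test-set MSE per random train/test split."""
    n_test = int(len(data) * test_fraction)
    scores = []
    for _ in range(n_reps):
        shuffled = data[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        scores.append(mse(train, test))
    return scores


rng = random.Random(42)
data = make_data(200, rng)

# Resubstitution: the model is scored on the very data it was built from.
resub_mse = mse(data, data)

# Repeated hold-out: average the scores and report a standard error.
scores = repeated_holdout(data, n_reps=10, test_fraction=0.3, rng=rng)
holdout_mse = statistics.fmean(scores)
std_err = statistics.stdev(scores) / len(scores) ** 0.5

print(f"resubstitution MSE: {resub_mse:.3f}")
print(f"hold-out MSE: {holdout_mse:.3f} +/- {std_err:.3f}")
```

With this model the resubstitution error is zero, because every training point is its own nearest neighbour, while the hold-out error reflects the noise in the data.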
27:20
In k-fold cross-validation you start by randomly permuting the data, and then you split it into k partitions of equal size, the folds. Then you iterate k times, each time leaving one of them aside as the test set: you train your model on the remaining k minus one and test it on the left-out partition. For each of these k repetitions you get a score, and in the end you average them and get your k-fold cross-validation estimate of the performance. So these are just two examples of estimation methods; there are of course many more around. And the goal of this performanceEstimation R package is exactly to make this easier.
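The k-fold procedure just described can be sketched as follows. This is again a Python illustration using only the standard library, not the package's implementation; the synthetic data, the least-squares line used as the model and the function names are assumptions for the example.

```python
import random
import statistics


def k_fold_indices(n, k, rng):
    """Randomly permute 0..n-1 and cut the permutation into k near-equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(train):
    """Least-squares fit of y = intercept + slope * x."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in train) / sxx
    return slope, mean_y - slope * mean_x


def predict_line(model, x):
    slope, intercept = model
    return intercept + slope * x


def cross_validate(data, k, fit, predict, rng):
    """Return one MSE per fold: train on k-1 folds, test on the fold left out."""
    folds = k_fold_indices(len(data), k, rng)
    scores = []
    for i, test_idx in enumerate(folds):
        train = [data[j] for f, fold in enumerate(folds) if f != i for j in fold]
        test = [data[j] for j in test_idx]
        model = fit(train)
        errors = [(predict(model, x) - y) ** 2 for x, y in test]
        scores.append(statistics.fmean(errors))
    return scores


rng = random.Random(1)
# Synthetic data (an assumption for this sketch): y = 3 + 2x + Gaussian noise.
data = [(x, 3.0 + 2.0 * x + rng.gauss(0.0, 1.0))
        for x in (rng.uniform(0.0, 10.0) for _ in range(100))]

scores = cross_validate(data, k=10, fit=fit_line, predict=predict_line, rng=rng)
cv_estimate = statistics.fmean(scores)
print(f"10-fold CV estimate of the MSE: {cv_estimate:.3f}")
```

Passing `fit` and `predict` as arguments keeps the estimation loop independent of the model, which is the same separation between workflow and estimation method that the talk goes on to describe.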
28:13
It tries to facilitate the carrying out of all this sort of experiments on different tasks, using different methods. So the infrastructure provided by the
28:25
package can be applied to any model, any task and any evaluation metric. That is the main design goal: to be completely general and flexible, to adapt to any user's approach to any predictive task. The main function of the package, which has the same name as the package, is called performanceEstimation, and this function takes three arguments. The first one is the set of predictive tasks that you want to use in your experiment. Sometimes it is only one, but you know, when you are writing papers you tend to evaluate on many, so you can provide in the first argument the set of predictive tasks, and I am going to describe in a while
29:11
what a predictive task is. The second is a set of workflows or, if you prefer, solutions to these tasks
29:18
that you want to compare, so that is the second argument. And the third one is the experimental methodology that you want to use: hold-out, cross-validation,
29:30
whatever. So those are the three main pieces of information that the function takes: on which tasks, which workflows or solutions you want to compare, and which experimental methodology you want to use to obtain reliable estimates of their performance on the tasks. Now, as I mentioned, the
29:49
function implements a wide range of experimental methodologies, including hold-out and cross-validation, but also others
29:55
that I will mention. Now, just a very simple example. So
30:00
suppose I want to estimate the mean squared error of an SVM on a certain regression task using ten-fold cross-validation. These first lines are just instructions to get the data. Essentially, the function call is this: you call the function with a predictive task, and a task is essentially defined by a formula saying what the target variable and the predictors are, plus the data set. Then you have a workflow, which is your solution; I will go into the details of what a workflow is, but it is a solution to the task. And then you say that you want to use cross-validation with ten folds. You press Enter and then you get your results. This object is the result of this performance estimation experiment, and then I can obtain textual summaries, like the average mean squared error, the standard deviation, the minimum value, the maximum, and whether there was any problem during these runs. You can also obtain a kind of box plot of the distribution of the scores across the ten repetitions in this case. So you can explore this resulting object
31:13
in the standard ways, either visually or through some statistical summaries. I am now going to go into more detail on the three components. So the predictive
31:26
tasks are essentially objects of a class which is defined in the package. A task should be defined by a formula, which essentially consists of saying what the target variable is and what the predictors are,
31:44
and then you should also supply the source data, which typically will be a data frame in R, and then you can optionally give an ID to the task. So here you have two examples of tasks, where the target variable is predicted using all the other variables; one of them is for the Boston data. And then the workflows: essentially a workflow is a solution to a predictive task, and workflows should be implemented by an R function. So you should have an R
32:21
function that solves the task. So workflows are objects of a class that hold the name of the function implementing the solution, then the parameters that need to be passed to this function (it can be zero parameters, or whatever needs to be passed), and optionally again an ID, a kind of internal label. So here are just a few examples of this, assuming that there is a function that accepts these parameters and that I want to apply it with these values. I will talk about this function in a moment. So that is workflows. The idea of this standard workflow results from the observation that most of the
33:11
time, what most users really want is to apply an out-of-the-box tool, like an SVM, a random forest, whatever. So they want to apply the learning algorithm to the training
33:28
set, then use the resulting model to obtain predictions for the test set, and then calculate some standard metrics on the test set with these predictions, like accuracy, error rate, whatever. In my view this is the most common setup for users, and because of that I wanted to make this as easy as possible within my package, and that is why I created this standard workflow function. It is a function that allows you to say: well, I really want to carry out this approach, I want to use this model, and for the rest I don't want to be bothered with the details. Of course you also have the flexibility to write your own workflow functions, because maybe you want to apply very fancy data pre-processing steps, and then to the predictions you want to apply some fancy post-processing; you can do that with the package. But if you want the kind of standard workflow, then you can use the function that is already provided. So that is the idea of this function. Essentially,
34:37
this function accepts several parameters that allow you to control these steps. One of the main ones is where you specify the R function that implements the learner, say an SVM, a random forest, whatever;
34:52
then it accepts another parameter where you provide the arguments that should be passed to this function; and there is another one where you say which function should be used to obtain the predictions, which in R is typically predict, and that is the default. But you can specify it, maybe you want to
35:12
add your own very special way of obtaining predictions. So those are parameters of this function. There is another one that controls which function is used to calculate the metrics,
35:27
the performance metrics, from the predictions, and there by default I have again two functions that calculate standard classification metrics and standard regression metrics. So if you don't say anything, the function will look at the type of the target variable: if it is a nominal variable, it will call the classification metrics, which use some standard things like error rate, and if it is a numeric variable, it will call the regression metrics to calculate the mean squared error. You can also control that through the parameters if you want different metrics.
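The three controllable steps (learn, predict, evaluate) and the default choice of metrics by target type can be sketched like this. It is a Python illustration of the design, not the R package's standard workflow function: `standard_workflow`, its parameter names and the two toy learners are all hypothetical stand-ins.

```python
import statistics
from collections import Counter


def classification_metrics(truth, preds):
    """Error rate: fraction of wrong predictions."""
    wrong = sum(t != p for t, p in zip(truth, preds))
    return {"err": wrong / len(truth)}


def regression_metrics(truth, preds):
    """Mean squared error."""
    return {"mse": statistics.fmean([(t - p) ** 2 for t, p in zip(truth, preds)])}


def standard_workflow(train, test, learner, learner_params=None,
                      predictor=None, evaluator=None):
    """Fit `learner` on `train`, predict `test`, and score the predictions.

    `train` and `test` are lists of (features, target) pairs.
    """
    model = learner(train, **(learner_params or {}))
    predict = predictor or (lambda m, x: m(x))  # default: the model is callable
    truth = [y for _, y in test]
    preds = [predict(model, x) for x, _ in test]
    if evaluator is None:
        # Default metrics depend on the type of the target variable.
        nominal = all(isinstance(y, str) for y in truth)
        evaluator = classification_metrics if nominal else regression_metrics
    return evaluator(truth, preds)


# Two toy learners, hypothetical stand-ins for an SVM or a random forest.
def majority_class(train):
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def mean_regressor(train, shrink=1.0):
    mean = statistics.fmean([y for _, y in train])
    return lambda x: shrink * mean


cls_train = [((0,), "a"), ((1,), "a"), ((2,), "b")]
cls_test = [((3,), "a"), ((4,), "b")]
reg_train = [((0,), 1.0), ((1,), 3.0)]
reg_test = [((2,), 2.0), ((3,), 4.0)]

cls_score = standard_workflow(cls_train, cls_test, learner=majority_class)
reg_score = standard_workflow(reg_train, reg_test, learner=mean_regressor,
                              learner_params={"shrink": 1.0})
print(cls_score, reg_score)
```

Each step has a default, so the shortest call only names the learner, while any of the three steps can be replaced by passing a different function.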
36:03
So the idea is to try to allow the user to control these three steps and
36:08
have reasonable defaults for each of them, to make the most common situations as easy as possible. Now of course, as I mentioned, you may want to do something very fancy, very sophisticated, with a workflow that includes sophisticated pre-processing steps or whatever. You can do that also with the package: you can create your own user-defined workflows. There are a few constraints for these functions: essentially it must be an R function that
36:39
accepts a formula in the first argument, the training set in the second and the test set in the third. These three are mandatory, and anything else is up to you. So you can write your own workflow function: the first argument a formula, the second the training set, the third the test set, and then whatever is required for your workflow the way you implemented it, and I will give you an example. And the
37:10
function must return an object which should contain at least a
37:16
vector of the scores of the metrics that are being used. It can contain more things, you may even put the model in it, or the predictions, but the main thing is to return the vector with the scores. Suppose that I want to create a workflow that essentially combines a linear regression model with a regression tree. So I want to create a workflow that, given a training set, obtains a linear regression model and a regression tree and combines the predictions of the two. So I could create an R function that I just call my workflow. As I mentioned, it needs to ensure that the first three arguments are a formula, the training set and the test set, and then this workflow also has another argument, which sets the weights that I am going
38:08
to use in the combination of the predictions of the linear regression and the regression tree. What this does is fit the linear model with the training data, fit a regression tree with the training data, obtain the predictions of both of them, combine the predictions using these weights, which by default give the same weight to the two models, and then I just output the mean squared error of these predictions. That is a workflow. Of course it is not a very fancy workflow, but it could be as complex as you want; if you want fancy pre-processing steps, you can include them. And as soon as you have this function you can call it in the same way: performanceEstimation with this workflow, this function, and I want to try it with this parameter, and then you get the results like with any other. So the flexibility comes from allowing the user not only to use this kind of standard workflow but also to write their own workflows. Sometimes you not only want to evaluate these workflows, you want to try several variants of the same workflow, say, tuning the parameters of an SVM or a random forest, and for that there is this workflowVariants function. What this function does is generate a set of workflows, and each of these workflows will be a variant, with different parameter values, of an existing workflow. So you can for instance say that you want to try several variants of an SVM: I say, well, I want workflow variants of the standard workflow where the learner is an SVM, and the learner parameters are cost, which is one of the parameters of the SVM, and I want to try these two values, not only one, and gamma, which again is a parameter of the SVM, and I want to try these values. What this does is generate all possible combinations of the parameter values, so in this case it will generate four SVM variants automatically for you, and then you get the same sort of results. Of course you can do the same with your own workflow functions: suppose that I wanted to try different weights for the linear regression and the tree. So that is how you can use this function to automatically try different variants of your favourite workflows or solutions.
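Generating one variant per combination of parameter values amounts to taking a Cartesian product of the candidate values. The sketch below shows that idea in Python; it is not the package's workflowVariants function, and the function name, the variant naming scheme and the concrete values for cost and gamma are made up for the illustration (the talk does not give the values used on the slide).

```python
from itertools import product


def workflow_variants(workflow, **param_grid):
    """Return one workflow description per combination of parameter values."""
    names = list(param_grid)
    value_lists = [param_grid[name] for name in names]
    variants = []
    for i, values in enumerate(product(*value_lists), start=1):
        variants.append({
            "id": f"{workflow}.v{i}",          # hypothetical naming scheme
            "workflow": workflow,
            "params": dict(zip(names, values)),
        })
    return variants


# Two candidate values for each of two parameters give 2 x 2 = 4 variants.
variants = workflow_variants("svm", cost=[1, 10], gamma=[0.01, 0.1])
for variant in variants:
    print(variant["id"], variant["params"])
```

Keeping the parameter settings inside each variant is what later makes it possible to ask which settings a given variant, such as the best-scoring one, was built with.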
40:48
Then of course, when you look at the results, you get names like variant 1, variant 2 and so on for the SVM, and you get the different scores for these things. By the way, here I was playing with the metrics: I did not want only the mean squared error but also some other error metrics, and the standard ones are already implemented in the package, so you get this sort of results. Of course you can then also ask: well, this SVM variant looks interesting, what are the parameters of this one? And you get the information on the parameter settings of that variant of the workflow. You can also do things like ranking the models, and you can of course plot the results, and you see, for the different metrics and for the different variants, a visual comparison of the scores, again using cross-validation in this case. You can also rank these things for the different metrics. Of course I don't have time to go through all of them, so I will just mention that, as we have already seen in the examples, we were using cross-validation, but hold-out is also implemented, as well as random subsampling, which is basically repetitions of hold-out, leave-one-out cross-validation, several variants of bootstrap, and also Monte Carlo experiments for time series or time-ordered
42:32
data; that is also implemented in the package. Now this is just one example of a classification task using hold-out. It is essentially the same, but instead of the cross-validation settings I have hold-out settings, and that changes the experimental methodology. Of course the parameters are slightly different: in this case I need to specify the size of the hold-out and the number of times I want to repeat this random split. And here is a slightly larger experiment: this
43:09
time, instead of having a single task, I have a vector of several tasks, and a vector of several workflows; in this case I have several variants of different learning systems on these two classification tasks, using three repetitions of ten-fold cross-validation. When you run this experiment you get of course different results for each of the tasks and for each of the metrics, in this case just the error rate, and then you get the scores for each of the variants of each of the learning systems. Again you have ranking, getting the top performers and stuff like that, and information on the parameters, the same as before. So you have a set of utility functions to play around with the results from the experiments. Now, the final topic that I wanted to talk about, just to wrap up, is the issue of trying to check whether the observed differences between the different workflows are statistically significant or not. That is of course related to hypothesis testing, and the null hypothesis is that there is no difference among the set of alternative workflows, and we typically use this kind of threshold to have some degree of confidence in the observed differences. Now, most of the experimental methodologies do not ensure independence among the different observed scores, and so because of that, and because of the theoretical background behind it, which I don't have time to talk about, we use the non-parametric Wilcoxon signed-rank test to carry out these tests. The paired t-test is also implemented, but I would not recommend using it, for these reasons. So you can carry out an experiment and then take the object that you obtained, and there is another function, pairedComparisons, that picks the object resulting from this experiment and calculates the statistical significance results. Essentially, by default it picks the first workflow, gets its average score and standard deviation, and then the others are compared against it, so you have the difference and the p-value of that difference. You get this sort of table for each of the data sets, in this case just one, and for each of the evaluation metrics. There is a baseline parameter, where you can select the baseline workflow against which all the other ones are compared. And there is another function that allows you to prune the results by some limit on the p-value, to get a smaller table containing only the statistically significant differences according to some threshold. That is basically it. The package kind of makes it easy to obtain these tables that we all like to put in our papers.
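The paired comparison of one workflow against a baseline with a Wilcoxon signed-rank test can be sketched as follows. This is a Python illustration with the standard library, not the package's own code; the per-fold scores are invented, and the exact p-value is computed by brute-force enumeration, which is only practical for a small number of paired scores.

```python
from itertools import product


def signed_ranks(baseline, other):
    """Return (w_plus, w_minus, ranks) for the paired differences other - baseline."""
    diffs = [o - b for b, o in zip(baseline, other) if o != b]  # drop zero differences
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Tied absolute differences share the average of their ranks.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, ranks


def exact_p_value(ranks, w_plus, w_minus):
    """Two-sided exact p-value: enumerate all 2^n sign assignments of the ranks."""
    observed = min(w_plus, w_minus)
    total = sum(ranks)
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        plus = sum(r for r, s in zip(ranks, signs) if s)
        if min(plus, total - plus) <= observed:
            hits += 1
    return hits / 2 ** len(ranks)


# Invented per-fold MSE scores for a baseline workflow and a competitor.
baseline = [11.0, 12.5, 10.8, 13.1, 12.0, 11.7, 12.9, 10.5, 11.3, 12.2]
other = [10.2, 11.9, 10.1, 12.0, 11.6, 10.4, 12.1, 10.3, 10.2, 11.0]

w_plus, w_minus, ranks = signed_ranks(baseline, other)
p_value = exact_p_value(ranks, w_plus, w_minus)
print(f"W+ = {w_plus}, W- = {w_minus}, p = {p_value:.5f}")
```

Here the competitor is better on every fold, so all the rank mass is on one side and only the two most extreme sign assignments are as extreme as the observed one.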
46:35
It is also very easy to export this to LaTeX and just put the table in your paper. All right, so that was very fast, and I am at the end. Are there any questions?