Merken

# Trade-offs in Statistical Learning

#### Automatisierte Medienanalyse

## Diese automatischen Videoanalysen setzt das TIB|AV-Portal ein:

**Szenenerkennung**—

**Shot Boundary Detection**segmentiert das Video anhand von Bildmerkmalen. Ein daraus erzeugtes visuelles Inhaltsverzeichnis gibt einen schnellen Überblick über den Inhalt des Videos und bietet einen zielgenauen Zugriff.

**Texterkennung**–

**Intelligent Character Recognition**erfasst, indexiert und macht geschriebene Sprache (zum Beispiel Text auf Folien) durchsuchbar.

**Spracherkennung**–

**Speech to Text**notiert die gesprochene Sprache im Video in Form eines Transkripts, das durchsuchbar ist.

**Bilderkennung**–

**Visual Concept Detection**indexiert das Bewegtbild mit fachspezifischen und fächerübergreifenden visuellen Konzepten (zum Beispiel Landschaft, Fassadendetail, technische Zeichnung, Computeranimation oder Vorlesung).

**Verschlagwortung**–

**Named Entity Recognition**beschreibt die einzelnen Videosegmente mit semantisch verknüpften Sachbegriffen. Synonyme oder Unterbegriffe von eingegebenen Suchbegriffen können dadurch automatisch mitgesucht werden, was die Treffermenge erweitert.

Erkannte Entitäten

Sprachtranskript

00:04

the on and on about me and my family and I wanted to adjournment and also thanks to all the organizers of fall on radios all cheering for the invitation the headed to talk yesterday and the state of Mato giving is a subset of the data of the workshop and talk about trade-offs in statistical running condition trade-offs that not only much but traders sold took about a you of a few of things up I should mention that services work medical officer all over the world and so few people in the region someone 1501 injection with a gun and Jordan and so the yield but people talk about the data and I think you know everyone has their own idea of what it is contingent on you can describe it as a bold phenomenon the collection of changes in their science much like global warming is that it's hard to define but that's a lot of things were working on and there are several important aspects but I think it's 1 that so that we know we have focused on the road the kissing statistics is that there is really connect could collected all along and it's not always a very useful for most of the data might not be relevant to the polymer 10 and debate I can use a very complex at major very different home if there might be a lot of issues attached to this day that we might want to keep it private undermined units of errors in it the comments of history it's inherently when we had all this complexity trade-offs and on and on so yeah this is 1 of the reasons why intervention until elections later and you have a lot of big numbers the idea is that when we have a lot of data like this and not trying to say something about the very high dimensional parameter so DeGette and a very large and can be very large but we'll find introduced assumptions about the mothers of structures sparsity we try to make up the pollen Tractebel from a statistical point of view by introducing all this and there was this decisions answers tend to be optimizes Parliament this creates a combinatorial columns when you just focus on making good methods so if bring estimation and you want to solvents musician :colon this might be intractable shouldering officers testing in 1 compute averages on very large sets this might also be intractable and so this is this is indeed a barrel when all objective is to do computationally efficient statistics we want to be able to 10 petitioner what's going to happen when they're going to have good method in the computer on and so on In order to Duke of statistics in a computationally efficient way of that the vote was the worst case harmless interacting with Parliament smallness is 0 for all all the information it tells us that the 1st thing that we want to do the office India it is not going to be possible all the time but it doesn't say much more centuries 1st of all it's possible that there is an agreement that exists and that we work for frequent instances of Annandale pollen still in custody non-complex regression the time can do a stochastic gradient descended into eternity unionization if the right conditions on here and this would true for natural signals and in general then we can solve this upon himself fall in other instances also it's possibilities we didn't create another optimization Parliament have epoxy functional optimization parlance this is what we're doing we're doing complex relaxations on what we're doing the usual trick of transforming the unity on 2 and and 1 the and so in order to us in order to say that something is going to be high on the day of the computationally from a statistical point of view we have to focus on average this hypothesis of pneumonia was the Solomons to but some test is hard to achieve our consistently and efficient went so the big question is can we have located in the 2 right to vote and not only bow in French so can we have good statistical performance and going efficiency at the same time on and so on and this is this is a figure that I think about if people have started using describing the statistical error mourners as a function of the running time so above this line is essentially the things that you can do for which door on government right so do you know this is essentially the information theory to threshold that says that no matter what the running time is you can't crucial statistical

05:16

error the image some some particular threshold of but there's been more and more work that suggests that actually this statistical error will increase on when the constraint on the outgoing excited on so there are 2 kinds of results around that again the point of view is I understand of results the first one is to say what we have in mind what reasonable computing timing so there's going to be to regions to be the 1 where we have infinite computing time and so the would be information theory and then there's going to be some or all of the regions where we know how to compute things up and then there's going to be another rule about that might be higher than the other ones so this is kind of a binary results but in this in this kind of sitting where would to get no amount In the other 1 it's Moss move it's more on the final analysis of discovering some sense that some work but to put extended they do more and more relaxation so they get more and more points on upper bounds on this side of the car and those other cases where people who get a particular competition mothers so they have more on a successor to describe what's running time means and able to show that the statistical error for that particular procedure will increase when this is the end of picture I want you to have in mind the keystone I case when working at statistical error and agreement efficiently and at the same time so I just to give just to give an idea so full of people who oversee this before this since this is the sister of recession began asking for 10 minutes and we get after so this is a spouse Principal Component Analysis in particular sports principal component of detection so we have a X 1 X and independence random vectors that all center and the question is on his back these sectors have anisotropic distribution all is their response the direction the weather is more violence in off the dime right so this is different from just Principal Component detection where the candy anything in high dimension this is going to be a very complicated :colon there's a random matrix theory tells essentially that we need a very strong signal we need that the added violence but grows with the dimensions of this thing here really problematic in high-dimensional setting so this is why in some sense we focus on sparse vector vectors that I'll make this an aligned with the axes and the signal strength 25 is in some sense the distance between the 2 nations have a different also of course of the day is very small this is going to be a hot pot it's very large India that is the problem right and so we have some results are actually so if you want to look at the statistics point of view of the 1st we have a really tight result for this essentially are we take the empirical covariance matrix and now we maximize this quadratic form although union-sponsored actors so if share we had signaled population covariance matrix of would get 1 on the nite and 1 press states up under the alternative and so we just have to control the deviation when the deviations of this statistic when we don't have the population covariance matrix when we had the empirical work on when you do an analysis of this and you find that the standards to be view of this all we will have GE shall we have key which is in some sense of the true dimension of the upon news of the garrison factor but this analysis is that from an information security point of view of we have matching abounds mourners of the same but of course we don't know what graphically how to optimize this when we do an analysis with relaxation so the usual trick of substituting a response at assumptions by and 1 loans and with an added relaxation of some doing on transforming lifting it in the something different positions from Altman have a relaxation but few people realize that and more in in the Indian but we showed that actually you read it requires on to the effective industry-leading I'm used to it but this is a much stronger than the stronger signal fit that needs to be of the order screwed of key square this finds this fact undivided by so this consumers better than when we 1st of sold and the Maxwell where you need fit that any greater than spirit of overran I that it's serious about tomorrow's results compared with Sophie goes to good to be slide that had the share this is just saying what we have this area here and we have a civil to more results here and we don't know if it's because because of history here always because or analysis is not really good long because this method is

10:32

not right so the overall picture for the testing upon and that we have is that computationally efficient tests seem to require this at 1st and so it seems to say that she of course there's no detection it's possible she there's a communitarian method that works in here but there is no on putting a brilliant time methods we stopped working but of course this is just a

10:57

suggestion right now this is just another mound that we had are the true situation could be very different there could be another 1st more than 2 such that there is an agreement that starts working at this at this level of signal for life at between 1 and 2 and so on In order to show that this is not the case at all we need to we need to look at complexity theory to grow abound so that is a bit Lake information fed go lower bounds when we're doing minimax analysis and statistics but this time taking into account the going efficiency but the testing procedures in order to ensure that this is not possible at all we have to use some assumptions from computers and we have to use the fact that some continuity is Honda on average I think so but the

11:49

problem that we're going to have look at is the plan to keep Parliament so it's it's very easy

11:55

to describe what it's about random graphs that a possible kind of random arrests were each edges randomly connected with probability of 1 had independently so seriously distancing matrix and when you think that reducing the Matrix essentially Cuomo's a bunch of obtuse nutria idea proficiency so and its

12:18

petitioners are about to hear constant 1 on and

12:22

another distribution arm is the practical distributions of Islamic age so a subset of completely connected vertices sharing knowledge is planted in a graphics added so in the adjacent the matrix is going to be a small square that again now it's if so of course on the small graph-like this it's kind of easy and to see that there is 1 hour but in general on how will we can say is that this fundamental instance would have no expectation that has sponsored no structure meeting the original part and that we hadn't passed this year when it was either something where everything was completely independent was also nice although as a sponsor none of those present are they and we need to differentiate the 2 possible cases and of course just schematic peaches are presented as 1 square but of course it could be everywhere and so that's what makes it so truly hop-on and because we would have to go everywhere to fight captain so if we just look at this point and from you know purely on probabilistic statistical point of view of our own the disciple offices you can show that the 2 of the largest fleet would have a size smaller than too long and with a very high probability and under dissident didn't have size greater than kept at bay the Commission because we've planted a kick of sidestep so of course when the vise president to Logan 1 it's possible In theory on how to detect the presence of the sky but of course it's anti-hunger Honda to check out the value of this number so people have looked at all the methods to try and detects even estimate the position of the Stitzel so the fact is that you have some premium time detection method that works with requires this size to be greater than Stroock & and is the size of the Army is the number assesses the size of the crowd come and there's a very strong ADB officers entered through computer science that says that essentially it's impossible to do better if Eric Chapel is about love and that's more than one-half then there's no premium time method that works even to distinguish needs to distribution so on and so this assumption is considered to be strong for a few reasons for support of a very small people of trying to improve this result and haven't been able to do so that's the authority of argument but that other people have also shown that if you focus on particular competition on mothers seoul Markov chains of all kinds of relaxing complex annexation for this Parliament screwed and is where things start to fall apart some and not keyboard views this as a primitive to pull the whole mess of all the things so to prove that it's hard to approximate Nash agree agree on that summer cryptographic systems are secure so it is considered that the euro and this is something that we can use to show that all the columns in the statistics on Hong we appreciate even tho average reduction will show a link between columns and will say Well if there was an algorithm that would work really well for this statistical Parliament that that we have sought exports this year that I talked about in the beginning of where we could modify up we could take we could take a breath as input modified it and then put it into the that we have regional statistical parliament and it would contradict this hypothesis please and so that's what we've been able to do with being able to show that you can make components of to he can and and we've been able to show that any kind of detection you'll this threshold would contradict this this so the decor message here is that when you focus on him computationally efficient method of for the affection that is in this Parliament there is a gap of square the rights to indicate that there has been leading the armory in these 2 regions so in the computing time and reasonable computing times other gaps between the barriers for statistical sheet they are not the same

17:08

result is true for estimation so this is something I Weaver with worked on with some critics at Cambridge later on and the picture is a bit more complex because then the all too metric and about the size of the signal and then there's this kid of convergence it's not just candidate I it's all clues can you detect and we show again that in the 1st sentence similar incidents in regions I there is a link between estimating vision and recovering the who were able to show that many when you focus on the again competition the efficient methods there is again a gap of square yet but so here so conjectures people have given their personality at the center of the plaza try to solve problems absolutely but I mean you could also argue that she different from interviews and improved conjecture and so maybe we can compute and the connection between the entire show that that was the start of this resource it's not know it's not as hard as it's known as secure in some sense as she wasn't it's possible that this conjectures force and that no different from and to to dispense however you it's been shown that you cannot say that if you want to if you want to prove lower bounds of all things in the average case in general but it seems very likely that you cannot use worst-case assumptions and despite the technical details of the water Yaqui of ponds would collapse if that were true soul it's it's assumed that you need an average case assumption you can't prove these results just based on 2 different from so you always going to have to link different parliaments an average case with some possibly distribution other questions before I threw the response to steer scientists and this is the home of 1 of the world No that's a very good question but it's not it's it's not clear that it's always equivalent when some apartments those are very large you can show that you can deduct the deduction Delaware as well but it's not in very natural order parameter regions looking 1 so this was 1 particular upon and why you cannot have your cake and eat it too but of course this is not the all universe situation In the last few decades we've seen the power of complex annexation of statistics and machine-learning we see also that more is more recently it has been anything more focus on that the fact that a lot of things will work by magic of the time have affair as Gigi is not supposed to work out it will work I actually pool that will work at the time so there's a lot of situations where you can actually have these 2 things at this same time man other people yeah I have worked on a showing that you can have statistical performance and some other thing as well it doesn't have to be a computationally efficient on so some people have worked on privacy data security robustness to others the fact that data can be distributed so we can have statistical performance and some other thing and eventually whose origins of statistical performance so the question that I had in mind is what you have all 3 can you have your cake and eat it too and the cherry on top as well on Monday and I and I like to give us the ability to have a negative results in general so I focused on Parliament where you cannot have all 3 at the same time together have to if you want but you can't have statistical performance an agreement efficiency and enhance the Disuq performance and some of the thing but have noticed that in in the event of this analysis of the time on the issue it is shown that this works for all the energy off or something that has a lot of strong assumptions of but I've wondered you know can you show that sometimes it doesn't work for meters in particular and usually doesn't work for me for some I regret making efficient estimated so can we on the cases when we have this too but not this so 1 example of this section of pretty natural that like to talk about now is between statistical efficiency are great efficiency and the fact that you can't distribute yield data so this is something I with related to sponsor matrix cancel this is a rather magicians of to have distributed data on 1st of all it can be as worldwide said that yesterday to be to stall or move around but you can also have you can also have all the type of constraints so that at the time cannot be shipped 3 so you know you have a country and you have datasets in different parts of this country of some and maybe you know these are hospitalized and not willing to show that but that there may be ready to share 1 average or a small part of the data the question is then likely do as well as if you have all the the data and so this is the this is the case for picture industries but you have X 1 extended all distributed with summer according to some distribution and you want to estimate something about this distribution so this is when you have all the data and you have some good guarantees of 4 the had to deviation bound guarantees with high probability but the question is when you distributed and is when you cut in the relative small blocks computer and blocks with an overt and sample can you do as well when estimation can only be done locally so you have a bunch of local tomatoes and then you aggregate and so he I wrote an average but doesn't have to be an average Canyon or consider all kinds of things and still get here it's assumed that all these other blokes on and so on of course there are practical advantages to doing things like this so 1st of all as I said in the beginning it can constraint on union men have security issues also raised issues but also you have a running time advantages when you do things like this right if you'll ever to compete he distributes order computations we divide computing time not and even if you have to do it no 1 out 1 by 1 even if you have to do 1st

24:15

competitions for this 1 and then for this 1 and then for this 1 Axelrod and then it was to speed up computations because your you have a smaller sample sizes to begin with and so if you have and to the 1 full of drunks and the running time is and to the powerful then you can improve even when you have to do with them 1 by 1 this power the cat so of course not issues when you do things like this you have a loss of a loss of signal right each article resonator is based on less Sampras so the question is how does it and of course it's going to be different for different policies children average when everything commutes plants or share when your computing this it would be equal to the average of them but fall doesn't commute like this and it doesn't scare when people are focused on Iraq recently all positive results for this saying that there is a ranch of estimators where you can paralyze things was things based on complex musicians so if you have a strong assumptions of on enjoyment of you can show that John unionize the geometry of this Parliament and show that true you have minimal loss when you do things so the natural question is to know that I had in the beginning is can this be extended to any kind of history forfeit them in but to get on situations where you cannot compute the Yemeni when this is not a complex Parliament also can do anything and so on the plan and found that is a pretty natural law but that raises an issue that is producing policy have been matrix of observations the million or so the signal is a transpose fall a sparse vector the parliament and the noise just a bunch of honorary degree fish annual matrix is Q and so on just to fix on the skating we assume that the in the signal at all constants when there is synonymous with poverty children but most of the time they quotas so is more as the number of these unknowns awkward questions parliament and the northern they said is independent coefficients would send revised 1 has the usual suggestion to and in this Parliament baseballs and we fix this small city to be constant and screw because this is where interesting phenomenons the coming up to this level is the last 1 was the loser the In such a case that has become a lot more long sojourn in the half and 1 element of it can be constant for the year that analysts yes yes yes yeah the and fixing all of this because this is where the real interesting stuff happens also to be 1 of the reasons that were in you know what this is it this is where I think start to be interesting when you stomp to distribute it but I mean you could change things a little bit you could say that fight on is smaller than this and then you have to change this city and then the same kind of phenomenon we observed that this is 1 example of 1 interesting things 2 and so all when you want to do it she has estimated it or even something very simple and the meaning of the Act I Scene can think of this as a steadier than it did in each of us to have a bunch of you have a bunch of qualifies for patients and you don't really know yet what the disease looks like so you compare all their or their fight and if you have all the patients finds have this big matrix of observation and the idea is that whenever they both have the disease when other people look-alike and something specs I think this is a list of the polymer 1 can but of course we don't actually have access to this data the stations on something cities of the country of the poll on hospital and this is not going to be shared so the only data that actually exists is this 1 and so you have entered the at sign for any constant at Sundance blocks but in which brought you have an over an hour and Paul finds all and so the question is of course you know is the statistical Holland still possible 1st of all is impossible when you haven't distributed the data then is it possible when you do this and then how does a very good make efficiency our community so I 1 thing where we can see 1st of all that they are going to be issues is that even if we do a very simple U.S. for I thought the mean of the fight some of the provisions were going to see all wife things become complicated when we divided the data and so some or all of the provisions of this matrix because we've added that we've made a of some coefficients higher so the overall on you should have should be higher as as well but we see that we estimates of up to some sort of consensus with high probability and so if this constant that comes from the sparsity so dispositive sequence group and and if this constant is large enough that we can detect the presence of signaled the end we can have a pretty good estimates for so this is when we have all the data this is the baseline resort to the benchmark in some OK but when we kind of debate that at this hour than sparsity locally when you are when you compare it to the size of the bloc has changed when you zoom in essentially things look more spots cannot he cannot the computations and you see that kid silent so the sparsity in 1 block stands With an at sign and the size of the rock we a smaller constant then 1 had and so if you try to sum in each individual rock band added mean would be smaller than the size of the deviations and things will not work turning so you can be you know challenger what I'm saying is that by saying Well 1st of all media it's just you know a simple heuristic maybe this 1 doesn't work with other things would work and you can also say what you know of course you made upon and Honda now there's this data so maybe you've made the parliament possible by the having their stated that so 1st stuff will see that actually you don't make upon an impossible when you look at the data this it's still possible to estimate a when you only have access to the smaller than the blocks like this so the 1st we do the analysis when all the data is available and we stop living in the United States is going to be the 1st based perspectives students over for now we don't care about computing time so we are assuming again that we can maximize this quadratic form on the same type of results that we had false passed this year was the moment we have a deviation fall on this factor that will be governed by 2 things but the spectral gaps fall the Matrix state and by the enormous some norm of the year the added noise pregnancy can control this norm of the analyzer can controlled by the king of the parameters of your opponents of the spectrum gap and you find a balance of that goes to 0 when and grows and so you have win and is very large a good idea of what this it's this allows you to create a good candidate set of non-zero of a proficiency so these are people who want to have a strong assumption that they have to dismiss the show there has a good intersection of the true support of and by doing this then you can do a 2nd phase of refinement and you look at essentially the poll finds that are highly correlated with those in W and intricate these highest provisions and this allows you to recover the horses so this is when you have all the data that the same analysis with the parameters changing just assumed it was still work when you do it's locally or on each block you still find that this will grow on trees will go to 0 in on the cross and the speed with United's store of course we have is that the center but it still works and was still able but this should be a W at s right we have in each block 1 candidate said we do this 2nd phase also in but and works so the plan is not being made impossible order from a statistical point of view by cutting up the data like this there of course the method that presented here is computationally so now we could say Well maybe this Parliament is computationally hardware or not it's addition the set In an action it's not so you can do the same analysis from just spectrum effort so you forget about the response to assumptions here on and so nice to have a different moment on because this most maximization pollen is less constrained but you still have the spectral gets here and when you analyze this on so this is where the chosen a K all this growth and you have something here that is less than 1 so you have no 1 Treeview angle between your estimated that around the true vector I this mean that this that the candidate said we have haven't still pretty good intersection with the true to and the 2nd phase was work the only thing you need for the 2nd phase to work is to have something that's not like 1 of those privileges in Anderson also small constant is fine because of the parliament initiative is not hard computationally changes to spectral analysis and will work and you can do it in addition to demand a response spectrum the question is can you do the same thing as and when I went from this slide to the side and said there was only a minimal loss that can you then the same analysis in which broke so of course since I'm asking the question it's not possible or are you do the same analysis of the operator on of the nite of the spectral get engaged more right than you find this but the Bond series nonconformity this is a distance on the steering and some sense this is abounded distance so if you have abounded goes to infinity doesn't tell you anything that could be a and in any effect on this year's satisfied and so on you cannot do the 2nd phase and you have no information so again this is saying what this prosecutor method doesn't work but you know maybe something else something else will 2 still objective is to have a statistical procedure for 8 that satisfies vise property so this is kind of the picture there shouldn't being between having your cake warming to offer it to win having something else on the site so you can have these 2 things at the same time Soriano again so the statistical Parliament and have the data being completely disappeared this is the small spectral method you can also sort of statistical Parliament if you have all the data it was just a simple spectral report the if you want to do all 3 at the same time so far Everything we tried doesn't work end so you can do the same thing as India as in the previous parliament talked about actually I'm so the global idea is that if you want to estimated 80 a in a distributed manner it means that you know as the meeting all the AS so if I go back 2 yeah this picture right so this is the only data that exists and so if you want to estimate the whole matrix or the vector the only way you can do it when this is the data available and you going to do is to Mission Oconee on each of these is to estimate on the day as well but there's a there's 1 part if there is 1 region warning recovering the support then this is going to them this is going to be upon as well when you want to estimate the water spread so if you could estimated the water matrix In addition to the way it would mean that you can estimated more blocks of individually In the end now the signal strength in each individual broke is too small and dispositive of it's small as well and these 2 are related and in particular how it's going to be a hot enough for computationally efficient procedures so we also do a reduction from the plant to keep Parliament that I talked about in the beginning and we showed that if you have these graphs of these 2 possible distributions on grass when you look at the adjacent see matrix do manipulations at some summarized to the provisions you will transform the adjacent matrices that come from this Parliament 18 instances of this part so the connection I think is even more evidence then that was fast asleep and in particular you we love you wish you ensure that if you had something that worked for the small matrices 40 children estimated that then you could use this to detect the presence of kids on very small heaps of interest which would contradict this assumption print so In the end we don't have that they think were able to show that this is the picture you can do statistics agreed with the efficient way to injuries distributed where'd you cannot do both at the same time anything so I mean this is more of a big picture messages that I want convention that when we look at it when we look at statistical problems and we want to say that you were able to add something tall statistical procedures have this it's something that we add it should not be considered individually we should consider all of these at the same time because otherwise we have multiple trade of so this is where the center of my talk was trade-offs grown statistical learning we can have our impossibility results when we want to have a lot of things at the same time the statistical there is another there's another type of trade-offs that I 1st as well so this is a work in mentioned in the beginning with the children and and now the other thing that we want to add in this problem is not to be distributed distributed computational distributed data for business to and so it's a statistical and it's really bit more of a tarpaulin indigenous less natural than the 1 they talked about the 2 ones and talked about the performance related to us that stated the deal even without going to march into details what is that in this Parliament to use statistical performance can be reached later Jihad methods so this is their exports this year this is like the Parliament 1 sponsor matrices that I talked about in the beginning of the end to improvements can be made to this to this Parliament with smaller statistical lows but they can be made individually and so 1 of the things you can do it you can use another statistical procedure there will be computationally efficient I would have some statistical lost with this procedure there would be and the other thing that you can do is you can now for constant portion of an can also change differently no statistical procedure on and now if you know even 99 per cent of your data is Jewell trash and then used to be able to detect complete noise from the presence of a little bit of signal in this part and the issue is you cannot do both at the same time so again in this this this type of picture when we can have a statistical efficiency would 2 other things individually so here by the robustness to arrows or competition efficiency that every the only way we can make an improvement is to not be able to To the other ones are the 2 improvements cannot be made simultaneously and so we actually have a negative result that that tells us that there is an issue of this Parliament we tried to do it 2 at the same time so it's not related to the predicate :colon it's related to another parliament charitable computer science cordoning parity written another 1 of these respondents that is assumed to be harder on average only thing you will you use you know and I mean no it's not it's not mini-games much a different from and Geneva it's just so the border 2 bullets flying and there not be any sustained can institution as so people have tried another unit will not be in which to improve the results and the people of all social and this is actually with the 1st kinds of competition lower bounds that existed that if you focus on 1 particular model of the statistical crazy you count the number of craze that you can make to this is going to be this is going to be problematic in this is kind of related to the Saudi literature on the statistical algorithms that it and found that some of things so centuries result polymers that are home when you all only using averages of yours that there is a school of Malaysia yet is that all this is a list of things I knew where that would be they fully defined the history of the political successes competition lower among precisely saying if you look at if you have an agreement that includes the presidents of you assuming bound the hands then it would contradict you can construct another garden and that would contradict something he is the nominal somebody there will have perform better or so than thought possible in another part always the meeting was that the company could that at the time of the year is that because that the local I always use this kind of thing that and that's when he was absolutely but was a resident said the 2 onto whatever that this because this is true in all competitions uninterruptible are well aware that this is an issue that they've shown that it's very late he so you know I unless toward putting them in article says it's very likely that you cannot show hamas of learning an average by using just 2 different from empty all the worst-case assumptions to essentially ending although not all assumptions on costs owing to different from injuries much much much stronger than the assumptions when using ships and so on but it is still an improved conjecture but it's not as keen as a statistical where you have you have downloaded to the order of valuation this and so of course it's not as keen as bad and you know there's no the grass until the methods that tells you how to construct a about friend statistics you have all these Duran's attorney Andrew computer gear a computer screen that Madden says and you find out you find still here there's a bit of a recipe which is essentially you find 1 of these ponds are encouraged to Computer Sciences assumed to be the Harlan averaging he tried to see if it's really you know benefits if it's related to the pot all you reverse-engineering trying to find a statistical Parliament that's linked to 1 of these assumptions and but it's more something that you always going to have to do if you want to say anything negative about the computation about you statistics than in the competition and efficient manner and I I agree and this is a think of in the next conclusion side of it be interesting to have a fine analysis of these trade-offs on but I think this is this is always kind of related to us 2 of the bombs and it's nice it would be nice to have so I'm going back to the thing is to have a bit of time to education yeah it should be really great to be able to see you know with proof without any assumptions 1st of all there is a bound that's different from Justin information retrieval and also it this tribunal the very fine manner how this works well executive disbanded earlier this this would be really director from petitioner's right Italian I had this much running time but this is the added argument there no that way able to manage all history of 60 but it should be possible when you focus on a bouncer at all when you focus on 1 particular agreements and you estimated going to be based on the duration of some kind of gradient descent or something else and then you have to you and you can show that may be a lower bound for this particular optimization upon for this particular it's sobering American going back to the end and so the results that we have here in computational units to during its mission in distributed manner or in a manner that's robust to a lot of errors for 1 thing and that I didn't completely talk about because essentially was still working on it is you can avoid some of these issues if that's what you're interested in by adding some redundancy so here in some sense the very nice are the very nice aspect of having you'll be done this is that if someone gets their hands on that then they cannot say anything about about the war dataset even if they have enough 1 of Amsterdam's from it's in an infinite it's going to be fine so it essential if you are not replicating the meaning and will data by creating these belts essentially that don't exist yet on you could create a situation where are if anyone has access to 1 of these blokes individually in reasonable computing time they're not going to be able to extract anything from the data so in some sense it's a secure but you can still you can now do things because essentially now that would be existed you some audience coefficients and essentially agency that communicates with orders data centers would still be able to say that something is so this is something interesting if this is something that could be nice if you're interested in new body position results fixing these are issues and then there are more questions about you know if you or if you think that this is actually a good thing but if you think that the y'all dataset now I sent in some senses demanded more secure but no 1 being able to Ireland but 1 being able to look at it and extract anything from it can you systematically made their assets more secure can you have a dataset transform it into something that looks like what I had this past symmetric us in Parliament and no make it essential to someone who bound in computing so this is a big related to issues include direct more from a statistical point of view and then as I said and when 1st came to this side of Martin may have defined our analysis of these trade-offs on and you know what are we going to have to pay in some sense I want to have a refinery medicines administrators have to focus on 1 boat Europe 1 mother of owing to have only upper bounds and so it's complicated the to the role of developments already in this very you know binary manner with a lot of conjectures Excel it's going to be even harder if you don't have 1 In the end year this is possible to reduce interest so thanks a lot for your attention also on the rise so traditions of the University of conditions that this is the 1st year isn't the only got hold of the USS also called the investment know this it will also exist yeah demands an idea 1 of his students has work on that in 1 of my slides and traders and something and I had a reference so yeah so the plan to keep Parliament they've shown that they have a method that works well for estimating repeats of size groups and times small constant and they have shown that if you have sold the same kind of message we're talking about local methods essentially based on looking at these trees you walk around looking in the graft them unable to improve their constant so Del all Valle methods like this I think you'd be interesting also to look at look at situations where no saying and you we have will only use ingredients on pensions all investors Axelrod but the data is distributed is that creating the center on thing that the really interesting is based on information fury on but you have to make some assumptions about this thing that's although an interesting yeah this is related to income and any questions about so the U.S. is the right yesterday and descended fine-grained he's here the which is front of difficult because he feels he has very good tools to get abounds in Tuen Mun and advances in Cologne on colonial and in the interests of options have listened and you also have to appreciated the great quantities in arms which is cool things about making assumptions which is still a proxy for the run-time things like that and we were know the communication and the electricity Holmes imagines 1st TCA if you show that you get something all too long in the sitting New England institutions since the banks of the dimensions include argued that that kind of commitment of all the attention as happens with the tension in the Sulawesi this Liukin understand that much again firing now so honored I don't really agree with your comments at the end of the this season and it would also was describing what I was saying you know you have some given method of framework so you know your creating a epoxy as we said before running time I and then you have bounced based on information curious thing there's been no amenities recent work that I'm aware of on sponsoring in the regression rebounded memory and there's been a lot of things like that when you're founder of computation the only issue that I have with this is that sometimes you have to make citizens statistical algorithms you have to have a framework in where did you know you're saying 1 this framework and able to show and inform you know true bounced through abounds based on information theory that can suggests In this kind of trade-off between running time Simon but actually find something Adelson ammunition that doesn't fall into this framework and that actually allows you to solve this part so then you have the other Ponemah it's hard to argue that no framework is but true description of Running time on so this is that this is something and Nunavut worried about being a framework agreement is doing with stands and then someone noticing action something really easy doesn't fully the questions by the way services from these citizens into the area where because of this reason and that was the only person that could read here as all if you fall since the fall of call the purpose visit for information on the the will that so I don't know this is something even looked at essentially having a continue this is kind of hard essentially as soon as you ask them to share in a little bit of information and suddenly you can recover everything so this can be good if you know your central regions and you really wanna know it hasn't been going on in the deadliest .period you have to shut Eastern Europeans will make sure that it not stays in some secure that it says no 1 has all the data cannot fall into the wrong hands on but really creaking can recover things so this is good in some sense the police there with yet the procedure results disappointed that hearing yeah but they're trying to do it move away this kind of holiday at Houston this month it's also an issue about at that time thank you

00:00

Mittelwert

Abstimmung <Frequenz>

Punkt

Exakter Test

Minimierung

Familie <Mathematik>

Zahlenbereich

Statistische Hypothese

Komplex <Algebra>

Stichprobenfehler

Statistische Hypothese

Computeranimation

Eins

Gradient

Schwach besetzte Matrix

Algebraische Struktur

Einheit <Mathematik>

Exakter Test

Mittelwert

Globale Optimierung

Lineare Regression

Schätzung

Statistische Analyse

Kombinatorik

Ereignishorizont

Tonnelierter Raum

Figurierte Zahl

Gerade

Schätzwert

Lineares Funktional

Parametersystem

Statistik

Mathematik

Higgs-Mechanismus

Globale Optimierung

Schwach besetzte Matrix

Entscheidungstheorie

Objekt <Kategorie>

Teilmenge

Menge

Rechter Winkel

Konditionszahl

Parametersystem

Injektivität

Algebraische Struktur

Ordnung <Mathematik>

Aggregatzustand

05:15

Resultante

Distributionstheorie

Punkt

Statistische Hypothese

Analysis

Computeranimation

Eins

Richtung

Gebundener Zustand

Karhunen-Loève-Transformation

Statistischer Test

Vektor

Statistische Analyse

Ausgleichsrechnung

Randomisierung

Gebundener Zustand

Distributionstheorie

Statistik

Hauptideal

Nichtkommutative Jordan-Algebra

Globale Optimierung

Störungstheorie

Schwach besetzte Matrix

Teilbarkeit

Rechenschieber

Arithmetisches Mittel

Rechter Winkel

Einheit <Mathematik>

Ordnung <Mathematik>

Standardabweichung

Nebenbedingung

Subtraktion

Ortsoperator

Multiplikator

Exakter Test

Kovarianzmatrix

Hausdorff-Dimension

Vektorraum

Stichprobenfehler

Schwach besetzte Matrix

Gewicht <Mathematik>

Endogene Variable

Abstand

Stochastische Abhängigkeit

Analysis

Glattheit <Mathematik>

Logarithmus

Stochastische Abhängigkeit

Schlussregel

Vektorraum

Kovarianzfunktion

Unendlichkeit

Quadratischer Raum

Quadratzahl

Flächeninhalt

Gerichtete Größe

Stochastische Matrix

10:31

Gebundener Zustand

Sterbeziffer

Heuristik

Statistik

Diagonale <Geometrie>

Polynom

Exakter Test

Logarithmus

Markov-Entscheidungsprozess

Computeranimation

Gebundener Zustand

Übergang

Statistischer Test

Exakter Test

Physikalische Theorie

Ordnung <Mathematik>

Analytische Fortsetzung

Kombinatorik

Analysis

11:47

Mittelwert

Resultante

Distributionstheorie

Matrizenrechnung

Punkt

Polynom

Ortsoperator

Zufallsgraph

Zahlenbereich

Statistische Hypothese

BAYES

Physikalische Theorie

Computeranimation

Schwach besetzte Matrix

Graph

Knotenmenge

Algebraische Struktur

Erwartungswert

Zufallszahlen

Adjazenzmatrix

Distributionenraum

Endlicher Graph

Randomisierung

Zusammenhängender Graph

Ordnungsreduktion

Kombinatorik

Stammfunktion

Clique <Graphentheorie>

Feuchteleitung

Parametersystem

Statistik

Verschlingung

Winkel

Globale Optimierung

Karhunen-Loève-Transformation

Kette <Mathematik>

Ordnungsreduktion

Wendepunkt

Teilmenge

Quadratzahl

Einheit <Mathematik>

Verschlingung

Rechter Winkel

Erwartungswert

Punktspektrum

17:06

Distributionstheorie

Einfügungsdämpfung

Momentenproblem

Knoten <Statik>

t-Test

Ungerichteter Graph

Gebundener Zustand

Deskriptive Statistik

Negative Zahl

Vorzeichen <Mathematik>

Gruppe <Mathematik>

Lineare Regression

Statistische Analyse

Offene Abbildung

Phasenumwandlung

Addition

Grothendieck-Topologie

Kategorie <Mathematik>

Winkel

Güte der Anpassung

Stichprobe

Schwach besetzte Matrix

Ereignishorizont

Konstante

Menge

Verbandstheorie

Rechter Winkel

Sortierte Logik

Physikalische Theorie

Beweistheorie

Ordnung <Mathematik>

Explosion <Stochastik>

Subtraktion

Folge <Mathematik>

Wasserdampftafel

Relationentheorie

Geräusch

Spektralmethode

Stichprobenfehler

Schwach besetzte Matrix

Koeffizient

Perspektive

Endogene Variable

Zeitrichtung

Abstand

Analysis

Erweiterung

Matrizenring

sinc-Funktion

Indexberechnung

Resonator

Unendlichkeit

Summengleichung

Mittelwert

Resultante

Matrizenrechnung

Punkt

Natürliche Zahl

Hochdruck

Gruppenkeim

Axialer Vektor

Element <Mathematik>

Komplex <Algebra>

Inzidenzalgebra

Übergang

Eins

Einheit <Mathematik>

Maßstab

Exakter Test

Bewertungstheorie

Meter

Gradientenverfahren

Kombinatorik

Clique <Graphentheorie>

Parametersystem

Perspektive

Statistik

Verschlingung

Amenable Gruppe

Reihe

Stellenring

Globale Optimierung

Ähnlichkeitsgeometrie

p-Block

Frequenz

Kommutator <Quantentheorie>

Teilbarkeit

Arithmetisches Mittel

Rechenschieber

Funktion <Mathematik>

Verschlingung

Gerade Zahl

Koeffizient

Phasenumwandlung

Garbentheorie

p-Block

Geometrie

Standardabweichung

Aggregatzustand

Schätzwert

Nebenbedingung

Hausdorff-Dimension

Gruppenoperation

Matrizenrechnung

Zahlenbereich

Punktspektrum

Topologie

Mittelwert

Schätzung

Stichprobenumfang

Grundraum

Leistung <Physik>

Schätzwert

Einfach zusammenhängender Raum

Mathematik

Logarithmus

Relativitätstheorie

Konvexer Körper

Vektorraum

Ordnungsreduktion

Objekt <Kategorie>

Maximum-Likelihood-Schätzung

Energiedichte

Quadratischer Raum

Flächeninhalt

Parametersystem

Mereologie

Normalvektor

Einfügungsdämpfung

Numerisches Modell

### Metadaten

#### Formale Metadaten

Titel | Trade-offs in Statistical Learning |

Serientitel | Computational and statistical trade-offs in learning |

Teil | 08 |

Anzahl der Teile | 10 |

Autor | Berthet, Quentin |

Lizenz |
CC-Namensnennung 3.0 Unported: Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen. |

DOI | 10.5446/20251 |

Herausgeber | Institut des Hautes Études Scientifiques (IHÉS) |

Erscheinungsjahr | 2016 |

Sprache | Englisch |

#### Inhaltliche Metadaten

Fachgebiet | Mathematik |

Abstract | I will explore the notion of constraints on learning procedures, and discuss the impact that they can have on statistical precision. This is inspired by real-life concerns such as limits on time for computation, on reliability of observations, or communication between agents. I will show how these constraints can be shown to have a concrete cost on the statistical performance of these procedures, by describing several examples. |