High dimensional statistical inference and random matrices
Formal Metadata
Title 
High dimensional statistical inference and random matrices

Title of Series  
Number of Parts 
33

Author 

License 
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. 
Identifiers 

Publisher 
Instituto de Ciencias Matemáticas (ICMAT)

Release Date 
2006

Language 
English

Content Metadata
Subject Area  
Abstract 
Multivariate statistical analysis is concerned with observations on several variables which are thought to possess some degree of interdependence. Driven by problems in genetics and the social sciences, it first flowered in the earlier half of the last century. Subsequently, random matrix theory (RMT) developed, initially within physics, and more recently widely in mathematics. While some of the central objects of study in RMT are identical to those of multivariate statistics, statistical theory was slow to exploit the connection. However, with vast data collection ever more common, data sets now often have as many or more variables than the number of individuals observed. In such contexts, the techniques and results of RMT have much to offer multivariate statistics. The talk reviews some of the progress to date.

Keywords 
canonical correlations
eigenvector estimation
largest eigenvalue
principal components analysis
random matrix theory
Wishart distribution
Tracy-Widom distribution

00:06
It is for me an honor and a great pleasure to introduce the invited speaker, Iain Johnstone. Iain Johnstone studied at the Australian National University and received his Ph.D. degree in 1981 from Cornell University. He is a professor in the Department of Statistics at Stanford University, and he also has a joint appointment in biostatistics at the medical school. He is a member of the National Academy of Sciences of the United States of America. Johnstone works mainly in theoretical statistics, more precisely in statistical decision theory and nonparametric function estimation. He has successfully applied a variety of mathematical techniques in statistical problems. Let me mention, for instance, the use of wavelets in function estimation, and the connection between the theory of random matrices and multivariate statistics. The title of his talk is "High dimensional statistical inference and random matrices".

So, thank you, and thanks to the Organizing Committee for inviting two statisticians to address this Congress. Let me start with a few words of orientation. Statistics is a mathematical science, and yet it is often quite jealous of its independence. There has been a long and still ongoing debate over the appropriate role of mathematics in statistics, and the massive growth of computing continually changes the terms of this debate. However, most would agree that one major goal of mathematics in statistics is to contribute understanding and results that users of statistics find helpful, and by users of statistics I mean people across the physical, social and biological sciences, engineering and commerce, and not just academia. There is of course a long tradition of just this kind of influence, going back to the earliest days of probability and statistics. But in today's world of nearly unlimited computing, this goal of mathematics for statistical understanding remains a vital one, and in fact is indeed more vital as we try to
organize and learn from haphazard computational discovery. So much current research, both in statistical theory and in many areas of application, including genomics, technology and astronomy, focuses on the problems and the opportunities posed by the availability of large amounts of data. You might have many variables and many observations on each variable. Loosely, you think of each variable as being an additional dimension, and so many variables corresponds to sitting in a high-dimensional space. Among several mathematical themes that one could follow, Banach space theory, convex geometry or even perhaps topology, I will pick one, namely random matrix theory, and its interactions with some important areas of what in statistics goes by the name of multivariate analysis. So here is my outline. You will hear many echoes of ideas from Percy's talk, and there is actually a good reason for that, which I'd like to acknowledge. Multivariate statistical analysis is a mature field, and statisticians have long felt the lack of information on fluctuations of extreme eigenvalues. It was a talk of Percy's several years ago on the Baik-Deift-Johansson theorem that sparked the idea that random matrix theory could provide some answers to classical multivariate statistics questions, and the goal of this talk is to tell some of that ensuing story.

So what is multivariate analysis? It deals with observations on more than one variable, where there is, or might be, some dependence among the variables. The most basic phenomenon is that of correlation: tall parents tend to have tall children, that sort of thing. But from the beginning there has also been an interest in reduction of dimension, in summarizing and interpreting data through techniques that reduce the number of variables, and a basic example of that is principal components analysis. While it is relatively basic, I want to use it as an organizing theme, together with a few remarks on related problems. So the first few topics will be largely definitions and background, and the more recent
work, in later sections of the talk, will concern what happens when one takes a point of view that is completely intrinsic to random matrix theory but not so common in statistics, namely that of allowing the number of dimensions, p, to grow. So let me start by saying what principal components analysis is. I assume I should give a definition, some discussion and an example. It is traditionally described first in terms of random variables, and then in terms of what one does with actual data. One imagines that you have a set of p variables, where p might be quite large, and pretty much all one has to assume about this collection is that there is a covariance matrix, namely the collection of cross-product moments between the k-th and k'-th variables. One wants to somehow reduce the effective dimensionality, and one does this by forming derived variables, linear combinations of the original variables. Those derived variables have a variance which is just a quadratic form in the covariance matrix, and the idea is to try to pick directions, derived variables, that soak up as much variance as possible from the original collection. That amounts to an eigenvalue maximization problem, in which one looks for the directions successively maximizing this quantity. After you find the first principal component eigenvalue and the corresponding eigenvector, you repeat the process, constraining the subsequent vectors to be orthogonal to the preceding ones. By maximizing the variance this way, one hopes that most of the variance can be summarized in a relatively small number of principal components. A standard problem in statistics is that you have a model such as I just described, but the covariance matrix, and in particular the principal component quantities, are unknown. Instead you have some data, noisy and limited: you have n observations on each of the variables. One thinks of each of these sets of n
observations as being organized in a column, the k-th column for the k-th variable. These are collected into an n by p data matrix, so the convention is going to be that the p variables index the columns and the rows index the observations, the n replications. I'll use a celebrated example from human genetics, certainly idealized from the original description; I will leave out a lot of that context in a couple of slides. The variables in the data table here are gene frequencies of various alleles of different genes, and there are 38 of those. The goal of the analysis was to try to summarize this in a small number of "synthetic genes", if you like, by principal components reduction. The replications here represent different locations, 400 sites in Europe, at which the gene frequencies of the local populations were measured. So you get this data matrix, 400 by 38, and
09:57
there is a standard preprocessing step: subtract the means of the columns. I will assume that is always done, so that you have a mean-centered data matrix. The central object for the talk is the resulting sample covariance matrix, where we now form sample covariances between pairs of variables by taking products and summing over the 400 observations, arriving at this cross-products matrix, suitably normalized. So now we go through exactly the same sequence of operations on the sample covariance matrix that I described for the population covariance. We look for combinations of the columns, derived variables, and look at their sample variances, now a quadratic form in the sample covariance matrix. One looks for the directions of maximum variance, successively orthogonal to the previous directions. These produce sample principal component eigenvalues and sample principal component eigenvectors. Note the hats on these: that is the statisticians' convention for estimated quantities, sample quantities. Here is the conventional picture of what is happening with principal components. Once we produce the sample eigenvalues and eigenvectors, in effect what we are constructing is a rotation from the original representation, where each of our 400, or n, observations is represented as a point in p-dimensional space. We have constructed a rotation, using the sample eigenvectors, that maximizes the variance in the first coordinate and perhaps the second. In the cartoon here, we would throw away the second variable and
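The sample procedure just described (centre the columns, form the normalized cross-products matrix, find the direction of maximum sample variance) can be sketched in a few lines. This is only an illustrative sketch, not the speaker's code: the function names are hypothetical, the covariance is normalized by n (an assumption; the talk only says "normalized"), the leading eigenvector is found by plain power iteration rather than a full eigendecomposition, and the data are synthetic.

```python
import math
import random


def center_columns(rows):
    """Subtract each column mean, returning a mean-centred copy of the data."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[k] for row in rows) / n for k in range(p)]
    return [[row[k] - means[k] for k in range(p)] for row in rows]


def sample_covariance(centered):
    """Cross-products matrix of the centred data, normalized here by n (assumed)."""
    n, p = len(centered), len(centered[0])
    return [[sum(row[j] * row[k] for row in centered) / n for k in range(p)]
            for j in range(p)]


def quadratic_form(S, v):
    """Sample variance v' S v of the derived variable with direction v."""
    p = len(v)
    return sum(v[j] * S[j][k] * v[k] for j in range(p) for k in range(p))


def leading_component(S, iters=500):
    """First sample principal component (variance, direction) by power iteration."""
    p = len(S)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(S[j][k] * v[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return quadratic_form(S, v), v


# Synthetic data: 400 observations on 2 strongly correlated variables.
random.seed(0)
rows = []
for _ in range(400):
    z = random.gauss(0.0, 2.0)
    rows.append([z + random.gauss(0.0, 0.5), z + random.gauss(0.0, 0.5)])

S = sample_covariance(center_columns(rows))
l1, v1 = leading_component(S)
share = l1 / (S[0][0] + S[1][1])  # fraction of total sample variance explained
print(round(share, 3), [round(x, 3) for x in v1])
```

Subsequent components would be found the same way after constraining the search to directions orthogonal to `v1`; in practice one would call a library eigensolver instead of iterating by hand.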
12:02
use the first variable as the reduced-dimension summary. So in our genetics data with 38 variables, the first variable
12:14
explains 27 per cent of the variation, in the sense that the first eigenvalue is 27% of the sum of all the eigenvalues. The second one explains 18 per cent, the third 11, and this is a fairly common situation: the first three sample eigenvalues capture in fact over half of the total variability. That of course leads to a natural question: how many of these are significant, and where do you chop off? I would say that there is still not really a widely accepted, mathematically based approach to deciding; it is generally heuristic.

OK, so in the case of the Cavalli-Sforza data, we can construct these sample principal components, the sample derived variables, but the observations also had a location. So we can actually construct contour plots, taking the first derived variable and plotting its values as a function of the location of each observation, and then drawing contours from large values to small. The very impressive feature is that there is a clear progression from Asia Minor to northwest Europe, and one can do similarly for the other principal components. Let me in a moment say why this is such an impressive example. It had been known from archaeological excavations that farming, over the period from about 9,000 years ago to 5,000 years ago, had diffused gradually from Asia Minor through to Britain and Scandinavia. That was archaeological knowledge. The gene frequencies are genetic data from contemporary populations, and so when the corresponding principal component that summarizes the information from the genes looked practically identical to the archaeological information, this was taken as pretty strong evidence in support of resolving a basic question which had been open: when farming diffused, was it the farmers themselves who moved, did the populations migrate, or was it the technology that spread, leaving the populations in
place? This was taken as strong support for actual migration, and similar interpretations in terms of linguistic properties and migration behavior were made for the subsequent second through fourth principal components. But that is a story for which one should really refer to the books of Cavalli-Sforza. The point of the example is really to give a significant instance of the use of principal components. What we will carry forward from the example is the dichotomy between a population model, in which we have these variables, an unknown covariance matrix Σ and the corresponding principal component eigenvalues and eigenvectors, and the n observations, noisy and limited, on the variables, which form a sample data matrix and for which we have the corresponding sample quantities. Naturally, we are interested in how well the sample quantities inform us about the population ones, both the eigenvalues and the eigenvectors.

So, with that general background, and none of that required a probability model in any great detail, if we want to make probability statements we need to introduce probabilistic assumptions. Let me review some basic definitions. The most basic model will be that we take p-variate Gaussian data; I use that notation because an individual observation is going to be a row with p components. The multivariate Gaussian distribution is characterized by its mean and covariance matrix, and we are going to postulate n draws from this multivariate Gaussian distribution. Of course this is only a model, and a common saying in our business is that all models are wrong, but some are useful, and there is not much doubt that the Gaussian model qualifies as useful. So collect the n observations and put them into an n-row, p-column matrix; from now on we will be assuming that the means are zero. The Wishart matrix, and the Wishart distribution, is formed by taking the unnormalized sample covariance matrix, the cross-products matrix, and this is said to have the Wishart distribution with
p variables, in the subscript, and n degrees of freedom, for a
17:54
population covariance matrix Σ. The density function for the Wishart matrix was derived by John Wishart, working under the guidance of Ronald Fisher at Rothamsted Experimental Station, in 1928, and that is basically the beginning of multivariate statistical distribution theory. The Wishart distribution and Wishart matrices connect to principal components analysis simply through the eigendecomposition: we start with our Gaussian data, form the Wishart matrix, and then take the spectral decomposition, which produces the sample eigenvalues and the sample eigenvectors.

Principal components is the focus of the talk, but I just want to mention one other statistical procedure, because it helps set the scope of the results, and that is so-called canonical correlations, which was introduced by Hotelling, as principal components was as well. Now we have two sets of variables, p of the X variables and q of the Y variables, and for a probabilistic model we will assume they are jointly Gaussian distributed in p plus q dimensions. The aim of the analysis is to find that linear combination of the X variables that is as correlated as possible with some linear combination of the Y variables. In climatology, in which canonical correlations is commonly used, the X variables might be sea surface temperature observations over some part of the oceans, and the Y variables might be land temperature observations at various locations on the North American continent, for example. You try to find the index of sea surface temperatures that is most tightly correlated with some index of land temperatures. So again you have the population version, and then, when you have n samples on each of the sets of variables, by going through the sample version of the procedure you get a generalized eigenvalue problem which now depends on two independent Wishart matrices. That is the reason for introducing the example: there are now two independent Wisharts, both on p variables but with different numbers of degrees of freedom, and
for general Gaussian distributions you get various general parameter matrices here, but for most of the talk I want to concentrate on the situation where these are actually the identity. Canonical correlations is emblematic of what I will call "double Wishart" settings, in which, in the simplest case, the null hypothesis, and I'll explain that in a little while, there are two independent Wisharts on p variables, with n1 and n2 degrees of freedom respectively. The single Wishart setting is essentially obtained by letting n2 become large in the limit. We are interested in the roots of a generalized eigenvalue problem for A and B. A very remarkable thing is that if you look at any textbook, either theoretical or methodological, on classical multivariate analysis, just about every chapter of such a book describes a technique that involves an eigenvalue problem that is either the double Wishart eigenproblem or the single Wishart eigenproblem. So, while I won't say anything about them, multivariate analysis of variance, multivariate regression, discriminant analysis, multidimensional scaling, factor analysis and the like all reduce to spectral problems of this type, and it is not surprising that getting information about the distribution of these roots in samples has long been a topic of great interest. It is remarkable that in 1939 the joint density functions for the roots of the double Wishart eigenproblem and the single Wishart eigenproblem were derived, in the null situation with identity covariance, simultaneously by five of the most distinguished multivariate analysts of that time, on three different continents. The form of the joint density is of course of interest to us. There is this product over the variables taken one at a time, in which we see weight functions, the weight functions from the classical orthogonal polynomials, and then there is this Jacobian arising
23:36
from the transformation from the observed data to the eigenvalues, the Vandermonde, and this Jacobian causes all sorts of trouble for the distribution theory, historically. So this was the
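For reference, the structure being described, a product of one-variable weight functions times a Vandermonde-type Jacobian, can be written out as follows. This is a reconstruction from standard results rather than a copy of the speaker's slide: the symbols $l_i$, $u_i$, $c$, $n$, $n_1$, $n_2$ and the particular form chosen for the generalized eigenproblem are assumptions, and the formulas are for real data with identity covariance.

```latex
% Generic form: weight functions taken one variable at a time, times the Vandermonde
f(l_1,\dots,l_p) \;=\; c \,\prod_{i=1}^{p} w(l_i)\,\prod_{i<j} \lvert l_i - l_j \rvert .

% Single Wishart, A \sim W_p(n, I), eigenvalues l_1 > \dots > l_p > 0 (Laguerre weight)
w(l) \;=\; l^{(n-p-1)/2}\, e^{-l/2} .

% Double Wishart, A \sim W_p(n_1, I),\ B \sim W_p(n_2, I) independent,
% roots 1 > u_1 > \dots > u_p > 0 of \det\bigl(A - u\,(A+B)\bigr) = 0 (Jacobi weight)
w(u) \;=\; u^{(n_1-p-1)/2}\,(1-u)^{(n_2-p-1)/2} .
```

The factor $\prod_{i<j}\lvert l_i - l_j\rvert$ vanishes when two roots coincide, which is the source of the eigenvalue repulsion, and it is also what prevents the density from factoring into independent pieces.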
23:53
beginning of a large and rich literature in multivariate analysis, in the United States, in India and elsewhere, but it became a rather complicated theory, and actually, it has to be said, it lost a lot of influence on applications. So I am going to take a detour around that, and now briefly indicate where random matrices
24:18
enter. Here are a couple of things that were said in more detail by Percy a couple of hours ago. In the spectra of heavy atomic nuclei, the energy levels are modeled as eigenvalues of the Hamiltonian operator, and while at low-lying energy levels theoretical analysis is possible, at higher energy levels things remain complicated. So Wigner had the proposal that one should instead seek statistical descriptions, and furthermore that one could obtain some of those statistical descriptions by modeling the eigenvalues as those of a suitably symmetric large random matrix. One of the most celebrated examples of such a statistical summary, which we will have cause to refer back to, is the Wigner semicircle law. For a large symmetric random matrix with independent and identically distributed entries, one finds the eigenvalues and forms the empirical distribution function of the eigenvalues, and in the large-n limit the empirical distribution approaches the so-called semicircle law.

Now, we heard in Percy's and Richard Stanley's talks about Gaussian ensembles, in which the matrix entries were Gaussian and the corresponding weight function in the eigenvalue distribution was Gaussian. It was fairly early on recognized that it would be interesting to look at other weight functions, such as those from the classical orthogonal polynomials, the Laguerre and Jacobi families, and of course, though it wasn't completely realized at the time, that yields precisely the Wishart and double Wishart eigenvalue distributions that, as I have argued, are central to most of multivariate statistics. By analogy with statistical mechanics, it is natural to introduce an inverse temperature parameter β as well, so that while in the statistical setting β was 1, one wants to introduce β more generally. Dyson, in a classic paper, showed that natural symmetry assumptions meant that the β values
would actually be 1, 2 and 4 in these models, and these corresponded to situations in which the matrix entries were real, complex or quaternion. And of course we have one of these situations that comes up often: the situation of greatest interest in applications is of course that of real data, but the mathematically more tractable situation is that of complex matrices. One should say that complex data models are the right ones in some settings, especially communications and some meteorological settings. So, to summarize: with the classical orthogonal weight functions, and particularly with the β values 1 for the orthogonal and 2 for the unitary cases, we recover the classical distributions of the null hypothesis, or no-structure, cases of multivariate statistics. This slide summarizes some of the classical types of questions that have been asked about eigenvalues, and some of the corresponding uses that are starting to appear in statistics. I won't talk about all of these. In particular, there has been a lot of work, a very rich literature,
29:01
around wireless systems, using results for the bulk eigenvalue distribution. In statistics, the newest, the brightest sort of new infusion of information has come from information about extreme eigenvalues, so I will concentrate on that, and say a little about eigenvectors at the end. A couple of remarks about asymptotics, which we are doing because the exact distributions are typically too complicated, as was remarked several times. In the 1950s and thereabouts in statistics, it was really not very natural to think of taking a large number of variables, p. One would typically have a small number of variables, and the natural asymptotic regime was to imagine large sample sizes, and that is what is done in the standard books. On the other hand, of course, right at the heart of random matrix theory is the notion that you have a large number of variables, and that is now more attractive for us in statistics because of the nature of contemporary data. In matching up the parameters there is an interesting side remark. If you match the Wishart density with the Laguerre ensemble, and match up the number of variables with capital N and the sample size minus the number of variables with the α parameter, you are led to this remark: in statistics there is no necessary relationship between the sample size and the number of variables, and so typically a model where one treats both N and α as large will be natural, while that might not be so natural in the first treatments of random matrix models of this kind. So double asymptotic theory is what one would need for statistical results. Let me give a concrete example of eigenvalue repulsion, which in statistics we think of as spreading of the sample eigenvalues. This is a pretty extreme case. Imagine that you have n = 10 samples from a 10-dimensional Gaussian distribution, and
you know that the population covariance matrix is the identity, so all the population eigenvalues are 1. But you draw a sample, and in the sample the eigenvalues vary over three orders of magnitude. If you didn't have supporting theory to tell you what was going on, you would probably be led to make some pretty false inferences about the nature of the population eigenvalues when you see a spread of three orders of magnitude. The description of this is given by the covariance matrix version of the semicircle law that I mentioned a moment ago. This is the Marchenko-Pastur law, which is valid more generally than I am describing it here, with the identity covariance, where we have an explicit formula. The empirical distribution function for the sample eigenvalues, that is, the sample principal component variances, is obtained by counting how many sample eigenvalues fall below a fixed value t, and in the limit as the number of variables increases one gets this quarter-circle-like form. The blue curve is an example for when the number of variables is a quarter of the sample size, so the ratio is a quarter, and you get a spread from 0.25 to 2.25. Then there is also an extreme spreading example, when p equals n: then you get a range of support from 0 up to 4, and we saw some of this in the sample on the previous slide. So that is a description of the behavior of the bulk of the eigenvalues. I want to focus primarily on information about the largest eigenvalue in the next couple of topics.

Let me say a little bit about the lingo of hypothesis tests. Suppose that we observe a set of sample eigenvalues, again in the setting of n = p = 10. We see that the largest one is 4.25, and we want to ask: is that consistent with the data having come from an identity covariance, or is the largest population eigenvalue actually larger than one? Notice that while the limiting support is
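The support figures quoted here can be checked numerically. The sketch below assumes the standard form of the Marchenko-Pastur law for identity covariance with ratio gamma = p/n at most 1 (the formula itself is not read out in the talk, so it is an assumption here), with the sample covariance normalized by n; the function names are hypothetical.

```python
import math


def mp_support(gamma):
    """Edges (1 - sqrt(gamma))^2 and (1 + sqrt(gamma))^2 of the limiting support."""
    root = math.sqrt(gamma)
    return (1.0 - root) ** 2, (1.0 + root) ** 2


def mp_density(x, gamma):
    """Limiting density of sample eigenvalues for identity covariance, gamma <= 1."""
    lo, hi = mp_support(gamma)
    if x <= lo or x >= hi:
        return 0.0
    return math.sqrt((hi - x) * (x - lo)) / (2.0 * math.pi * gamma * x)


print(mp_support(0.25))  # (0.25, 2.25): p is a quarter of n
print(mp_support(1.0))   # (0.0, 4.0): p equals n
```

Even though every population eigenvalue equals 1, the sample eigenvalues spread over the whole interval, and the spread widens as p/n grows toward 1.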
34:57
0 to 4, that doesn't rule out seeing something slightly larger than 4 in a sample. The kind of terminology here is that the null hypothesis would be a simple model with no structure, and we would like to live with
35:20
the simple model unless forced by the data to reject it in favor of a more general alternative hypothesis, which makes no such restriction on the covariance matrix. If you were to draw a few samples from the identity Wishart model and look at the largest eigenvalues, you would see this kind of fluctuation in the simulations, and it wouldn't really be clear whether 4.25 was consistent with the null hypothesis or not. So in fact what we need is the actual distribution of the largest sample eigenvalue under the identity Wishart model, the null hypothesis Wishart model. And that is where the Tracy-Widom distribution, which we heard about in both Percy's and Richard Stanley's talks, comes in. In all of these classical multivariate statistics null hypothesis settings, under the assumption of identity covariance, we can say that whether the data is real or complex, whether it is single Wishart or double Wishart, as long as the number of variables is comparable to the sample size, or to n1 and n2 in the double Wishart situation, then after appropriate centering and scaling the distribution of the largest sample eigenvalue is given by one of these Tracy-Widom laws, with β equal to 1 in the real case and 2 in the complex. As we have seen a couple of times, they have beautiful expressions in terms of solutions of, in particular for instance, the second Painlevé equation, a nonlinear second-order differential equation. From the point of view of a user this is not so easy to compute numerically, but now there are good pieces of code available, and from the user's point of view this is actually no different from using a computer package for the normal distribution. So it is publicly available for everyday use. The narrower one is for β equal to 2, the more spread out for β equal to 1. So we have available now a description of the Tracy-Widom distribution as the limiting approximation. I want to focus on second-order accuracy for a bit, which might seem like a technical digression, but in fact, if I was trying to convince the
applied statistical world, the applications community, of the relevance of this truly new set of distribution functions, then establishing that the approximations are useful when p is not very large is actually likely to be of significant help in that effort. It turns out that by making relatively small, appropriately chosen modifications of the centering and scaling parameters in this limiting distribution, the rate of approximation, which would normally be first order, of order p to the minus one third, can be improved to order p to the minus two thirds. In a rather more concrete way, you want to compare this approximation to exact tabulations, and I'll show a slide for the double Wishart case, where the centering and scaling have a somewhat different formula, because there is a fairly elaborate set of tables computed by William Chen, of the U.S. taxation office, if people are interested. In the double Wishart situation we have to remember the number of variables and the two degrees of freedom parameters, n1 and n2, and this plot is comparing our Tracy-Widom approximation to the exact values computed in the tables, over the conceivably interesting range of values of p, n1 and n2 covered in this way. So this is essentially the one parameter increasing through the increasing plots. What is plotted is the 95th percentile, which is a standard point in the distribution, useful for statistical hypothesis testing, and the approximation is clearly good enough for applied purposes in assessing the p-values associated with the null hypothesis. So these distributions really are relevant and useful. Just a couple of remarks on comparing this with traditional extreme value theory, about why things are so different here. Extreme value theory is an elaborate theory of the maximum of n independent random samples, and in such a setting it is natural to look at products, products of individual
events on individual components. But because of the eigenvalue repulsion, there is strong interdependence between the eigenvalues, and that approach is out of the window in the sample eigenvalue situation. Instead, this Vandermonde, this Jacobian that we were seeing before, has determinantal representations, and it turns out that determinants are really the key tool here. So instead of working with the product, the inclusion-exclusion relations force us in this direction, and that leads to the formulas that we also saw in Percy's talk. So there is a determinantal formula for the largest eigenvalue in the complex case in terms of this
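To make "appropriate centering and scaling" concrete for the toy case with n = p = 10 and observed largest eigenvalue 4.25, here is a small sketch. It rests on several assumptions that are not stated in the talk: the first-order centering and scaling constants published for the real single Wishart case, the reading of 4.25 as an eigenvalue of the cross-products matrix divided by n, and three tabulated upper percentage points of the β = 1 Tracy-Widom law. The function names are hypothetical, and the second-order corrections mentioned above are not applied.

```python
import math

# Assumed tabulated quantiles of the Tracy-Widom law with beta = 1.
TW1_QUANTILES = {0.90: 0.4501, 0.95: 0.9793, 0.99: 2.0234}


def tw_center_scale(n, p):
    """First-order centering and scaling for the largest eigenvalue of X'X (assumed)."""
    a, b = math.sqrt(n - 1), math.sqrt(p)
    mu = (a + b) ** 2
    sigma = (a + b) * (1.0 / a + 1.0 / b) ** (1.0 / 3.0)
    return mu, sigma


def standardized_largest_root(l1, n, p):
    """Standardize l1, the largest eigenvalue of X'X / n, by rescaling to X'X first."""
    mu, sigma = tw_center_scale(n, p)
    return (n * l1 - mu) / sigma


def tail_bracket(z):
    """Bracket the upper tail probability at z using only the table above."""
    upper = 1.0
    for prob in sorted(TW1_QUANTILES):
        if z < TW1_QUANTILES[prob]:
            return 1.0 - prob, upper
        upper = 1.0 - prob
    return 0.0, upper


z = standardized_largest_root(4.25, n=10, p=10)
low, high = tail_bracket(z)
print(round(z, 2), low, high)  # z is roughly 0.85; the tail lies between the two bounds
```

With only three table entries the sketch can do no more than bracket the tail probability between about 5 and 10 per cent; a proper calculation needs one of the numerical Tracy-Widom routines the speaker refers to.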
42:21
correlation kernel. One expresses this in terms of the classical orthogonal polynomials and proceeds with the analysis. The situation is a little more complicated in the real, β equals 1, case, but still amenable to analysis. As an antidote to all this analysis, as a side remark: recently there has been a truly probabilistic characterization of the Tracy-Widom distribution, posted by Ramírez, Rider and Virág, which connects it to the lowest eigenvalue of a suitable diffusion operator. So there really are other approaches as well.

Let us come back to this little toy example. We saw 4.25, and now that we have this approximation we can plug in and compute the approximate chances if the null hypothesis of no structure were correct. In fact there would be a 6 per cent chance of seeing 4.25 or larger. So by conventional criteria we would not be particularly persuaded that this number was due to anything out of the ordinary, a departure in the population from the identity covariance model. That of course raises a question, which is: if the largest eigenvalue really was larger than 1 in the population, could you detect it? That leads directly into questions of the power of the largest eigenvalue test, and so: what could we say about the distribution under a more general assumption about the covariance? That takes us to the next topic, and out of the most traditional domain of the classical ensembles. The classical ensembles correspond to the very symmetric situation with no structure, and in some sense, from a statistical perspective, this is barely scratching the surface, because we are really interested in the asymmetric situations, for example in order to construct interval estimates, confidence intervals, for the largest eigenvalue: some interval which with high probability covers the true value. And in many specific applications it is quite clear that the underlying covariance matrix might have
specific structure [unclear]. The exact eigenvalue distributions in these more general situations, both for the double Wishart and the single Wishart settings, have been derived, for example by Alan James, and they have expressions in terms of hypergeometric functions of matrix argument. So actually we really need serviceable approximations here as well, because these hypergeometric functions are notoriously hard to work with.

So let me tell you a little bit about some recent results on this question for principal components analysis. We ask: under what conditions on the population covariance matrix is it the case that we still have a Tracy-Widom limit for the largest sample eigenvalue, perhaps allowing the centering and scaling constants to depend on the general Sigma? I'd say we don't have a complete theory, but there are some interesting answers. El Karoui has shown that if sufficiently many of the population eigenvalues accumulate near the largest population eigenvalue, then you still have Tracy-Widom. And in an example that I'll say more about in a moment, a setting in which you have a few isolated, but not too isolated, population eigenvalues, you still have Tracy-Widom. The results are much more complete so far in the complex case, and I just want to note that that's because of this celebrated matrix identity in terms of determinants that is available for unitary matrix integrals. And of course, in light of the areas of application, we would love to have something similar for the real case as well.

This is the situation where there is a small, fixed number of population eigenvalues that could be different from 1, with that number remaining fixed, so a finite-rank perturbation, as the number of variables and the sample size grow in proportion. To set up the discussion, let's look back at the identity case. The convention on the pictures: in red, the population eigenvalues, which in
the null case all stack up at 1; in blue, the density of the sample eigenvalues, which is given by the Marchenko-Pastur quarter-circle law. The largest sample eigenvalue, as we've seen, has Tracy-Widom fluctuations, and I didn't emphasize the fact that these are fluctuations of order n to the minus two-thirds around the upper limit of the Marchenko-Pastur support. Now if we push one of these guys off to be much larger, arbitrarily large, we would expect that, if it's very well separated, the largest sample eigenvalue should have the traditional Gaussian fluctuations of order one over root n, and that's in fact pretty easy to establish by standard perturbation arguments. The question is what happens in between, and in various ways some answers have been assembled by Baik, Ben Arous and Péché, by Paul, and by Baik and Silverstein. And I'm really trying to show off my poor person's attempt at a movie here, to show that there is a very
49:24
interesting transition at an interior point within the support of the Marchenko-Pastur law. So if the largest population eigenvalue is still at one, we're sitting here with Tracy-Widom, and nothing changes as you move upwards, through the interior of the support, towards the critical point from the
49:47
left: the Tracy-Widom distribution remains correct and it doesn't move in location. Something very interesting happens exactly at the transition, which I won't talk about. But the minute you jump above it, you're in the Gaussian situation, and the largest sample eigenvalue is out there. The leading population eigenvalue has progressed, but notice that this is still within the bulk of the Marchenko-Pastur support. So even though the leading population eigenvalue is well inside the support, the leading sample eigenvalue pops out as a Gaussian. And as for why this transition should be exactly at one plus root gamma, the square root of the
50:40
upper limit of the support, I don't think we have a heuristic argument, although it has some interesting connections with the Stieltjes transform. So, in summary, we have a Tracy-Widom regime to the left of this critical point and a Gaussian regime to the right.

So I'd like to give an example of this that comes from economics. It's interesting because it helps explain a previously mysterious phenomenon, and the application of this phase transition was only identified in the last year or so. This has to do with factor analysis models for baskets of stocks or securities. Factor analysis is back in vogue, apparently, with the rise of Ben Bernanke, the Federal Reserve Chairman. A factor analysis model tries to represent a collection of securities in terms of a small number of underlying factors and individual security-specific fluctuations. The traditional idea had been that one might be able to represent a collection of a hundred or 200 security prices in terms of a small number of factors, and that it should be possible to identify these factors by principal components. Now this example was actually a negative example for principal components, but of course one should be interested in the limits of techniques as well.

So in a widely cited paper, Stephen Brown set up some simulations of a four-factor model, simulations calibrated to match the New York Stock Exchange. This factor model leads to four population eigenvalues above an underlying idiosyncratic noise level, and these are what one would expect to extract from principal components. The empirical observation from these simulations, which was regarded as a puzzle, was that you can't pull out these remaining eigenvalues from the many sample eigenvalues. And the explanation, from Matthew Harding at MIT economics this year, is that it is precisely because in
exactly this simulation these eigenvalues lie below the phase transition. So let me show some pictures. In typical models the market factor is very big, but the notion was that the next three should be distinguishable from the underlying noise level. This is a plot of the eigenvalues against the number of securities in the collection, in the simulation. So there are two things here. There is what one sees about the largest sample eigenvalue, and these are bands representing what happens in repetitions of the simulation, from the 25th percentile upward [unclear], and this is biased notably above the population value. But then the more surprising thing was the next sample eigenvalues, and many of them: they're all way above the second through fourth population eigenvalues that people thought ought to be identified. And so there is lots of text in Brown's paper about the lack of knowledge about the sampling properties of eigenvalues of covariances in samples. Actually, it's a completely straightforward application of this phase transition to recognize that the second through fourth eigenvalues are below the phase transition boundary, and of course then these sample eigenvalues just end up against the Marchenko-Pastur boundary. And the theory also says exactly what the bias should be in the top sample eigenvalue as compared to the population: the black is the prediction from the theory for what one would see in the sample as opposed to the population. So that's a pretty complete match.

OK, so I'll finish with some very brief remarks on estimating eigenvectors in principal components analysis, not because we have anything like a complete story, but rather because it's just clearly important and meriting of our attention. So suppose that we're in the single Wishart matrix setting, and we'll assume the population covariance has this finite-rank perturbation of identity structure, so there
are a fixed number of population eigenvectors here that we would like to estimate and learn about. In very traditional asymptotic models, where the number of variables is fixed and the sample size grows, there is a standard regime of Gaussian limit theory. But the interesting and significant phenomenon is that this completely collapses if the number of variables is proportional to the sample size. In fact, if the strength of these eigenvectors lies below the phase transition that I was describing a few slides ago, then you learn nothing, because the sample eigenvector is effectively orthogonal to the truth. And above that transition, well, you start to have an inner product approaching one, but in those situations, at finite strength, there isn't a consistent estimate [unclear].

This result has emerged in a couple of ways: in the statistical physics learning theory literature, where it was derived through non-rigorous replica methods from statistical mechanics, and more recently it's been rigorously established by Paul, again by perturbation arguments. This phenomenon is understood in the algorithmic signal processing literature: when you're dealing with large images with lots of pixels and so forth, it's been implicitly understood that one should represent the data in an appropriate basis and select a small number of features before you start thinking about doing principal components analysis. But we don't really have a theory that matches that yet, although it's sort of clear what the outlines of such a theory could involve. There are some results by [unclear], getting started. What one might want to do is assume that, in some orthogonal basis, the population eigenvectors have a sparse representation, in that only a relatively small number of components are large [unclear],
and we can add to that a simple observation: the sample eigenvector, for example in a single component model, can be given a signal-plus-noise kind of representation. If you look at the component of the sample eigenvector that is orthogonal to the true population eigenvector, that component is in fact uniform on an appropriate sphere. A classical remark says that an isotropic Gaussian vector is essentially uniformly distributed on a sphere, and so this component will essentially behave like Gaussian noise. So this converts an eigenvector estimation problem into a problem of estimating a mean vector: we're going from covariance estimation to mean estimation. And in the last decade we've collectively learned quite a lot about the estimation of sparse mean vectors, so it's quite likely that one will be able to get quite sharp upper and lower bounds for minimax quantities that capture how well you are estimating these collections of eigenvectors.

So that's it. I put a few [unclear] remarks there, and thank you for your attention.
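
Editorial sketch: the phase transition described in the talk can be illustrated numerically. The following is a minimal sketch, not the speaker's own computation. It assumes the standard spiked-covariance setting (a single population eigenvalue ell above a unit noise level, with gamma = p/n the limiting ratio of variables to sample size) and uses the limiting formulas as they are commonly stated in the literature following the work the talk attributes to Baik, Silverstein and Paul. Only the transition point, one plus root gamma, is stated in the talk itself; the function names are hypothetical.

```python
import math


def mp_upper_edge(gamma):
    """Upper limit of the Marchenko-Pastur support (unit noise variance, gamma = p/n)."""
    return (1.0 + math.sqrt(gamma)) ** 2


def transition_point(gamma):
    """Critical population eigenvalue: the square root of the upper support limit."""
    return 1.0 + math.sqrt(gamma)


def top_sample_eigenvalue_limit(ell, gamma):
    """Assumed almost-sure limit of the largest sample eigenvalue for a single spike ell."""
    if ell <= transition_point(gamma):
        # Below the transition the sample eigenvalue sticks to the bulk edge.
        return mp_upper_edge(gamma)
    # Above the transition it separates from the bulk and is biased upward.
    return ell * (1.0 + gamma / (ell - 1.0))


def squared_cosine_limit(ell, gamma):
    """Assumed limit of the squared inner product between sample and population eigenvector."""
    if ell <= transition_point(gamma):
        # Below the transition the sample eigenvector is asymptotically orthogonal to the truth.
        return 0.0
    return (1.0 - gamma / (ell - 1.0) ** 2) / (1.0 + gamma / (ell - 1.0))


if __name__ == "__main__":
    gamma = 0.25  # hypothetical ratio p/n, chosen only for illustration
    print("transition point:", transition_point(gamma))
    for ell in (1.0, 1.2, 1.5, 2.0, 3.0, 10.0):
        lam = top_sample_eigenvalue_limit(ell, gamma)
        cos2 = squared_cosine_limit(ell, gamma)
        print(f"ell={ell:5.1f}  sample limit={lam:8.4f}  cos^2={cos2:6.4f}")
```

Under these assumptions the output shows the three points made in the talk: population eigenvalues below the transition are invisible in the sample (the limit is the bulk edge), those above it are overestimated, and the squared inner product with the true eigenvector stays strictly below one for any finite ell.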