
New and Evolving Roles of Shrinkage in Large-Scale Prediction and Inference (19w5188)

The Banff International Research Station will host the "New and Evolving Roles of Shrinkage in Large-Scale Prediction and Inference" workshop in Banff from April 7 to April 12, 2019. In recent years, statistical shrinkage methods have been widely used to answer science and policy questions in experiments and settings that involve inference and prediction from big data sets. Current methodologies provide robust, scalable inferential techniques for solving complex problems in a wide range of application areas spanning the biological sciences, health-care management, public policy, economics, finance, and general operations. Motivated by these diverse applications, there has been prolific growth in computationally scalable shrinkage methodology, which has led to the use of several innovative, popular shrinkage heuristics. Our theoretical understanding of the effectiveness of traditional shrinkage algorithms is built on some of the most elegant and seminal works in statistical decision theory. Along this avenue, several new theoretical phenomena and results characterizing the efficacy of shrinkage in modern large-scale applications and big-data regimes have recently been developed. It is an opportune time to consolidate and unify this recent progress on the dual frontiers of applications and theory in order to build a robust inferential framework for the disciplined development and analysis of shrinkage methods that take into account the increasing complexity of the data. This workshop will provide a platform where researchers can exchange ideas and start collaborations on scientific projects; it is intended to bring together the leaders in this field, representatives of application areas, and promising young researchers to chart the path for future development of the field.

The Banff International Research Station for Mathematical Innovation and Discovery (BIRS) is a collaborative Canada-US-Mexico venture that provides an environment for creative interaction as well as the exchange of ideas, knowledge, and methods within the Mathematical Sciences, with related disciplines and with industry. The research station is located at The Banff Centre in Alberta and is supported by Canada's Natural Science and Engineering Research Council (NSERC), the U.S. National Science Foundation (NSF), Alberta's Advanced Education and Technology, and Mexico's Consejo Nacional de Ciencia y Tecnología (CONACYT).

20 videos (13 hours 3 minutes total)
Maruyama, Yuzo (47:14)
We consider estimation of a heteroscedastic multivariate normal mean. Under heteroscedasticity, estimators that shrink more on the coordinates with larger variances seem desirable. However, they are not necessarily minimax in the ordinary sense. We show that such James-Stein-type estimators can be ensemble minimax, that is, minimax with respect to the ensemble risk, which is related to the empirical Bayes perspective of Efron and Morris. This is joint work with Larry Brown and Ed George.
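Below is a minimal simulation sketch of this setting (illustrative only: the shrinkage constant is a simple heuristic, not the talk's ensemble-minimax tuning). In the heteroscedastic model X_i ~ N(theta_i, v_i) with known variances, a James-Stein-type estimator that shrinks coordinates with larger v_i more can dominate the MLE in total squared-error risk.

```python
# Heuristic James-Stein-type shrinkage under heteroscedasticity: coordinates
# with larger (known) variance v_i receive more shrinkage.
import numpy as np

rng = np.random.default_rng(0)
p, reps = 50, 2000
v = rng.uniform(0.5, 4.0, size=p)          # known, unequal variances
theta = rng.normal(0.0, 1.0, size=p)       # fixed mean vector

mse_mle, mse_js = 0.0, 0.0
for _ in range(reps):
    x = theta + rng.normal(0.0, np.sqrt(v))
    # coordinate-wise shrinkage: larger v_i => more shrinkage (heuristic constant)
    shrink = np.maximum(0.0, 1.0 - (p - 2) * v / np.sum(x**2))
    mse_mle += np.sum((x - theta) ** 2)
    mse_js += np.sum((shrink * x - theta) ** 2)

print("risk of MLE    :", mse_mle / reps)
print("risk of JS-type:", mse_js / reps)
```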
Fan, Yingying (47:31)
Characterizing the exact asymptotic distributions of high-dimensional eigenvectors for large structured random matrices poses important challenges yet can provide useful insights into a range of applications. To this end, we introduce a general framework of asymptotic theory of eigenvectors (ATE) for large structured symmetric random matrices with heterogeneous variances, and establish the asymptotic properties of the spiked eigenvectors and eigenvalues in the scenario of generalized Wigner matrix noise, where the mean matrix is assumed to have low-rank structure. Under some mild regularity conditions, we provide asymptotic expansions for the spiked eigenvalues and show that they are asymptotically normal after suitable normalization. For the spiked eigenvectors, we establish novel asymptotic expansions for general linear combinations and further show that they are asymptotically normal after suitable normalization, where the weight vector can be arbitrary. We also provide a more general asymptotic theory for the spiked eigenvectors using bilinear forms. Simulation studies verify the validity of our new theoretical results. Our family of models encompasses many popular ones, such as stochastic block models with or without overlapping communities for network analysis and topic models for text analysis, and our general theory can be exploited for statistical inference in these large-scale applications. This talk is based on joint work with Jianqing Fan, Xiao Han and Jinchi Lv.
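A small simulation sketch of the spiked setting described above (with assumed, illustrative parameters rather than the paper's exact regularity conditions): a rank-one mean matrix plus symmetric noise with heterogeneous variances, tracking a fixed linear form of the top sample eigenvector over replicates.

```python
# Rank-one spike plus generalized-Wigner-type noise; the studentized linear
# form a^T v1 of the top eigenvector should look approximately Gaussian.
import numpy as np

rng = np.random.default_rng(1)
n, reps, lam = 200, 500, 50.0
u = np.ones(n) / np.sqrt(n)                 # spiked population eigenvector
a = np.zeros(n); a[0] = 1.0                 # weight vector for the linear form
M = lam * np.outer(u, u)                    # low-rank (rank-one) mean matrix

vals = []
for _ in range(reps):
    sd = rng.uniform(0.5, 1.5, size=(n, n)) # heterogeneous noise scales
    W = rng.normal(0.0, sd)
    W = (W + W.T) / np.sqrt(2)              # symmetrize the noise matrix
    evals, evecs = np.linalg.eigh(M + W)
    v1 = evecs[:, -1]
    v1 *= np.sign(v1 @ u)                   # fix the sign ambiguity
    vals.append(a @ v1)

vals = np.array(vals)
print("mean of a^T v1:", vals.mean(), " (population a^T u =", a @ u, ")")
print("std  of a^T v1:", vals.std())
```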
George, Edward I. (49:25)
Bayesian models are increasingly fit to large administrative data sets and then used to make individualized predictions. In particular, Medicare's Hospital Compare webpage provides information to patients about specific hospital mortality rates for a heart attack or Acute Myocardial Infarction (AMI). Hospital Compare's current predictions are based on a random-effects logit model with a random hospital indicator and patient risk factors. Except for the largest hospitals, these predictions are not individually checkable against data, because data from smaller hospitals are too limited. Before individualized Bayesian predictions, people derived general advice from empirical studies of many hospitals; e.g., prefer hospitals of type 1 to type 2 because the observed mortality rate is lower at type 1 hospitals. Here we calibrate these Bayesian recommendation systems by checking, out of sample, whether their predictions aggregate to give correct general advice derived from another sample. This process of calibrating individualized predictions against general empirical advice leads to substantial revisions in the Hospital Compare model for AMI mortality, revisions that hierarchically incorporate information about hospital volume, nursing staff, medical residents, and the hospital's ability to perform cardiovascular procedures. And for the ultimate purpose of meaningful public reporting, predicted mortality rates must then be standardized to adjust for patient-mix variation across hospitals. Such standardization can be accomplished with counterfactual mortality predictions for any patient at any hospital. It is seen that indirect standardization, as currently used by Hospital Compare, fails to adequately control for differences in patient risk factors and systematically underestimates mortality rates at the low volume hospitals. As a viable alternative, we propose a full population direct standardization which yields correctly calibrated mortality rates devoid of patient-mix variation. (This is joint research with Veronika Rockova, Paul Rosenbaum, Ville Satopaa and Jeffrey Silber).
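The direct-standardization idea can be sketched in a few lines (synthetic data and assumed fitted parameters, not the Hospital Compare model): every patient is sent counterfactually to every hospital, and each hospital's standardized rate averages its predicted mortality over the same patient population, removing patient-mix differences.

```python
# Full-population direct standardization with an illustrative logistic model.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
n_patients, n_hosp = 5000, 30
x = rng.normal(size=(n_patients, 3))        # patient risk factors
beta = np.array([0.8, -0.4, 0.5])           # (assumed) fitted risk coefficients
alpha = rng.normal(-2.0, 0.3, size=n_hosp)  # (assumed) fitted hospital effects

# counterfactual predicted mortality: patient i at hospital h
p = sigmoid(alpha[None, :] + (x @ beta)[:, None])   # shape (n_patients, n_hosp)

direct_std_rate = p.mean(axis=0)            # one standardized rate per hospital
print(direct_std_rate.round(3))
```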
Yano, Keisuke (37:10)
We discuss predictive densities for Poisson sequence models under sparsity constraints. Sparsity in count data refers to situations with an overabundance of zeros or near-zero counts. We investigate the exact asymptotic minimax Kullback--Leibler risks in sparse and quasi-sparse Poisson sequence models. We also construct a class of Bayes predictive densities that attain exact asymptotic minimaxity without knowledge of the true sparsity level. Our construction involves the following techniques: (i) using a spike-and-slab prior together with an improper prior; (ii) calibrating the scaling of improper priors from the predictive viewpoint; (iii) plugging a convenient estimator into the hyperparameter. As an application, we also discuss the performance of the proposed Bayes predictive densities in settings where current observations are missing completely at random. Simulation studies as well as applications to real data demonstrate the efficiency of the proposed Bayes predictive densities. This talk is based on joint work with Fumiyasu Komaki (University of Tokyo) and Ryoya Kaneko (University of Tokyo).
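A proper-prior analogue of this construction can be written down in closed form (a minimal sketch assuming a point-mass spike and a conjugate Gamma slab; the paper's calibrated improper slab is more delicate): the Bayes predictive for a future count mixes a point mass at zero with a negative binomial.

```python
# Spike-and-slab predictive density for a Poisson intensity:
# theta ~ w * delta_0 + (1 - w) * Gamma(shape=a, rate=b).
import numpy as np
from scipy.special import gammaln

def nb_logpmf(y, a, b):
    # marginal of Poisson(theta) with theta ~ Gamma(shape=a, rate=b)
    return (gammaln(y + a) - gammaln(a) - gammaln(y + 1)
            + a * np.log(b / (b + 1.0)) - y * np.log(b + 1.0))

def predictive(y, x, w=0.9, a=0.5, b=1.0):
    # posterior spike weight: the spike (theta = 0) is only compatible with x = 0
    spike_lik = 1.0 if x == 0 else 0.0
    slab_marg = np.exp(nb_logpmf(x, a, b))
    post_w = w * spike_lik / (w * spike_lik + (1 - w) * slab_marg)
    slab_pred = np.exp(nb_logpmf(y, a + x, b + 1.0))   # updated Gamma(a+x, b+1)
    return post_w * (y == 0) + (1.0 - post_w) * slab_pred

for x in (0, 3):
    print(f"x={x}:", [round(predictive(y, x), 4) for y in range(5)])
```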
Lopes, Miles (34:03)
Spectral statistics play a central role in many multivariate testing problems. It is therefore of interest to approximate the distribution of functions of the eigenvalues of sample covariance matrices. Although bootstrap methods are an established approach to approximating the laws of spectral statistics in low-dimensional problems, these methods are relatively unexplored in the high-dimensional setting. The aim of this work is to focus on linear spectral statistics as a class of prototype statistics for developing a new bootstrap in the high-dimensional setting, which we refer to as the "Spectral Bootstrap". In essence, the method originates from the parametric bootstrap and is motivated by the notion that, in high dimensions, it is difficult to obtain a non-parametric approximation to the full data-generating distribution. From a practical standpoint, the method is easy to use, and allows the user to circumvent the difficulties of complex asymptotic formulas for linear spectral statistics. In addition to proving the consistency of the proposed method, we provide encouraging empirical results in a variety of settings. Lastly, and perhaps most interestingly, we show through simulations that the method can be applied successfully to statistics outside the class of linear spectral statistics, such as the largest sample eigenvalue and others.
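A stripped-down parametric bootstrap for one linear spectral statistic might look as follows (an illustrative sketch, not the full Spectral Bootstrap procedure): resample Gaussian data from the fitted sample covariance and recompute the statistic.

```python
# Parametric bootstrap for a linear spectral statistic (sum of log eigenvalues
# of the sample covariance), resampling from N(0, S_hat).
import numpy as np

rng = np.random.default_rng(3)
n, p, B = 300, 100, 500
X = rng.normal(size=(n, p))                 # data (true covariance = I here)

def lss(data):
    # linear spectral statistic of the sample covariance
    S = data.T @ data / data.shape[0]
    return np.sum(np.log(np.linalg.eigvalsh(S)))

S_hat = X.T @ X / n
L = np.linalg.cholesky(S_hat + 1e-10 * np.eye(p))   # jitter for safety
boot = np.array([lss(rng.normal(size=(n, p)) @ L.T) for _ in range(B)])

print("observed LSS       :", lss(X))
print("bootstrap mean/std :", boot.mean(), boot.std())
```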
Matsuda, Takeru (30:17)
I will talk about two recent studies on singular value shrinkage.
1. We develop singular value shrinkage priors for the mean matrix parameter in the matrix-variate normal model with known covariance matrices. Our priors are superharmonic and put more weight on matrices with smaller singular values; they are a natural generalization of the Stein prior. Bayes estimators and Bayesian predictive densities based on our priors are minimax and dominate those based on the uniform prior in finite samples. In particular, our priors work well when the true value of the parameter has low rank.
2. We develop an empirical Bayes (EB) algorithm for matrix completion problems. The EB algorithm is motivated by the singular value shrinkage estimator for matrix means by Efron and Morris. Numerical results demonstrate that the EB algorithm attains accuracy at least comparable to existing algorithms for matrices that are not close to square, and that it works particularly well when the rank is relatively large or the proportion of observed entries is small. Application to real data also shows the practical utility of the EB algorithm.
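For part 2's starting point, the Efron-Morris estimator itself is easy to state and try (the positive-part truncation below is a common variant; this is not the talk's EB matrix-completion algorithm): for a p x q observation X with unit noise, it shrinks each singular value sigma_i by (p - q - 1)/sigma_i, which helps most when the true mean matrix is close to low rank.

```python
# Efron-Morris singular value shrinkage for a normal mean matrix,
# X = M + E with i.i.d. N(0,1) entries in E and p > q + 1.
import numpy as np

def efron_morris(X):
    p, q = X.shape
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - (p - q - 1) / s, 0.0)   # positive-part variant
    return (U * s_shrunk) @ Vt

rng = np.random.default_rng(4)
p, q, r = 100, 5, 2
M = rng.normal(size=(p, r)) @ rng.normal(size=(r, q)) * 2.0   # low-rank truth
X = M + rng.normal(size=(p, q))

print("loss of X :", np.sum((X - M) ** 2))
print("loss of EM:", np.sum((efron_morris(X) - M) ** 2))
```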
Mondal, Debashis (29:44)
In recent years, interest in spatial statistics has increased significantly. However, for large data sets, statistical computation for spatial models is a challenge, as it is extremely difficult to store a large covariance or inverse covariance matrix and to compute its inverse, determinant, or Cholesky decomposition. This talk will focus on spatial mixed models and discuss scalable, matrix-free conditional sampling for their inference, considering both Bayesian computation and frequentist methods of inference, along with the role of shrinkage in the estimation. The work arose in collaboration with Somak Dutta at Iowa State University.
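The matrix-free flavor of such computations can be illustrated with a conjugate-gradient solve in which the precision matrix of an assumed lattice Gaussian Markov random field is only ever applied, never stored (a sketch, not the authors' conditional sampler).

```python
# Matrix-free CG solve of Q x = b with Q = tau*I + kappa*L, where L is the
# 4-neighbour graph Laplacian of an m x m lattice, applied on the fly.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

m = 200                                     # grid is m x m, so n = 40000
n, tau, kappa = m * m, 1.0, 4.0

def apply_laplacian(x):
    g = x.reshape(m, m)
    out = 4.0 * g
    out[1:, :] -= g[:-1, :]; out[:-1, :] -= g[1:, :]
    out[:, 1:] -= g[:, :-1]; out[:, :-1] -= g[:, 1:]
    # boundary nodes have degree < 4, so correct the diagonal there
    out[0, :] -= g[0, :]; out[-1, :] -= g[-1, :]
    out[:, 0] -= g[:, 0]; out[:, -1] -= g[:, -1]
    return out.ravel()

Q = LinearOperator((n, n), matvec=lambda x: tau * x + kappa * apply_laplacian(x))
b = np.random.default_rng(5).normal(size=n)
x, info = cg(Q, b)
res = tau * x + kappa * apply_laplacian(x) - b
print("CG converged:", info == 0, " residual norm:", np.linalg.norm(res))
```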
Zhao, Qingyuan (40:26)
Since the invention of instrumental variable regression in 1928, its analysis has been predominantly frequentist. In this talk we will explore whether Bayes or empirical Bayes may be more appropriate for this purpose. We will start with Mendelian randomization, the use of genetic variation as the instrumental variable, and demonstrate how an empirical partially Bayes approach proposed by Lindsay (1985) is incredibly useful when there are many weak instruments. Selective shrinkage of the instrument strength estimates is crucial to improve statistical efficiency. In a real application estimating the causal effect of HDL cholesterol on heart disease, we find that the classical model with a homogeneous causal effect is not realistic. I will present evidence of this mechanistic heterogeneity and propose a Bayesian model/shrinkage prior to capture the heterogeneity. To conclude the talk, several other advantages of using (empirical) Bayes in instrumental variable regression will be discussed.
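A toy summary-statistics version of why shrinking instrument strengths helps (illustrative only; the talk's empirical partially Bayes procedure is different): James-Stein shrinkage of the estimated strengths corrects the weak-instrument attenuation of a naive inverse-variance-weighted ratio estimate.

```python
# Many weak instruments: per-SNP effects gamma (exposure) and beta*gamma
# (outcome), observed with noise; compare naive and shrinkage-weighted IVW.
import numpy as np

rng = np.random.default_rng(6)
J, beta_true, se = 200, 0.3, 0.05
gamma = rng.normal(0.0, 0.15, size=J)             # true strengths (many weak)
gamma_hat = gamma + rng.normal(0.0, se, size=J)
Gamma_hat = beta_true * gamma + rng.normal(0.0, se, size=J)

# James-Stein shrinkage of the estimated instrument strengths toward zero
z = gamma_hat / se
c = max(0.0, 1.0 - (J - 2) / np.sum(z ** 2))
gamma_js = c * gamma_hat

naive = np.sum(gamma_hat * Gamma_hat) / np.sum(gamma_hat ** 2)
shrunk = np.sum(gamma_js * Gamma_hat) / np.sum(gamma_js ** 2)
print("naive IVW:", round(naive, 3), " shrunken IVW:", round(shrunk, 3),
      " truth:", beta_true)
```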
Pati, Debdeep (31:31)
We show that any lower-dimensional marginal density obtained from truncating multivariate normal distributions to the positive orthant exhibits a mass-shifting phenomenon. Despite the truncated multivariate normal having a mode at the origin, the marginal density assigns increasingly small mass near the origin as the dimension increases. The phenomenon is accentuated as the correlation between the random variables increases; in particular, we show that the univariate marginal assigns vanishingly small mass near zero as the dimension increases, provided the correlation between any two variables is greater than 0.8. En route, we develop precise comparison inequalities to estimate the probability near the origin under the marginal distribution of the truncated multivariate normal. This surprising behavior has serious repercussions in the context of Bayesian shape constrained estimation and inference, where the prior, in addition to having full support, is required to assign substantial probability near the origin to capture flat parts of the true function of interest. We show that, without further modifications, commonly used priors are not suitable for modeling flat regions, and we propose a novel alternative strategy based on shrinking the coordinates using a multiplicative scale parameter. The proposed shrinkage prior guards against the mass-shifting phenomenon while retaining computational efficiency. This is joint work with Shuang Zhou, Pallavi Ray and Anirban Bhattacharya.
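The mass-shifting phenomenon is easy to reproduce by simulation (a sketch assuming an equicorrelated covariance with rho = 0.9, sampled by rejection through its one-factor representation): as the dimension grows, the first marginal of the orthant-truncated normal places less and less mass near zero.

```python
# Mass near the origin for the first marginal of an equicorrelated normal
# truncated to the positive orthant, as the dimension d grows.
import numpy as np

rng = np.random.default_rng(7)
rho, eps, n_draws = 0.9, 0.1, 100000

for d in (2, 10, 50):
    z0 = rng.normal(size=n_draws)
    z = rng.normal(size=(n_draws, d))
    x = np.sqrt(rho) * z0[:, None] + np.sqrt(1 - rho) * z   # equicorrelated
    keep = (x > 0).all(axis=1)              # truncation to the positive orthant
    frac_near0 = (x[keep, 0] < eps).mean()
    print(f"d={d:3d}  accepted={keep.mean():.3f}  "
          f"P(X1<{eps}|orthant)={frac_near0:.4f}")
```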
Li, Jialiang (42:45)
The Fama–French three-factor model is commonly used to describe asset returns in finance. Statistically speaking, it implies that the return of an asset can be accounted for directly by the three Fama–French factors, i.e., the market, size, and value factors, through a linear function. A natural question is: would some kind of transformed Fama–French factors work better? If so, what transformation should be imposed on each factor to make the transformed factors better account for asset returns? We address these questions through nonparametric modelling, proposing a data-driven approach to construct the transformation for each factor concerned. A generalised maximum likelihood ratio based hypothesis test is also proposed to check whether transformations of the three factors are needed for a given data set. Asymptotic properties are established to justify the proposed methods, and extensive simulation studies show how they perform with finite sample sizes. Finally, we apply the proposed methods to a real data set, which leads to some interesting findings.
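A simple data-driven stand-in for such transformations (a sketch with synthetic factors; the talk's estimator and test are more refined) is an additive spline model r = f1(MKT) + f2(SMB) + f3(HML), compared against the plain linear three-factor fit.

```python
# Additive spline transformations of three factors versus a linear fit.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import SplineTransformer

rng = np.random.default_rng(8)
n = 2000
F = rng.normal(size=(n, 3))                 # synthetic MKT, SMB, HML factors
r = (0.5 * np.tanh(2 * F[:, 0])             # nonlinear ground truth
     + 0.1 * F[:, 1] ** 3
     + 0.2 * F[:, 2]
     + rng.normal(scale=0.3, size=n))

linear = LinearRegression().fit(F, r)
basis = SplineTransformer(degree=3, n_knots=8).fit_transform(F)  # per-factor splines
additive = LinearRegression().fit(basis, r)

print("R^2 linear  :", linear.score(F, r))
print("R^2 additive:", additive.score(basis, r))
```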
Bhattacharya, Anirban (40:38)
Markov chain Monte Carlo (MCMC) algorithms are commonly used to fit complex hierarchical models to data. In this talk, we shall discuss some recent efforts to scale up Bayesian computation in high-dimensional and shape-constrained regression problems. The common underlying theme is to perturb the transition kernel of an exact MCMC algorithm to ease the computational cost per step while maintaining accuracy. The effects of such approximations are studied theoretically, and new algorithms are developed for the horseshoe prior and constrained Gaussian process priors in various applications.
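One concrete instance of perturbing an exact kernel (an assumed setting, sketched here: the conditional Gaussian draw of regression coefficients under a scale-mixture prior, via the auxiliary-variable identity of Bhattacharya, Chakraborty and Mallick (2016)): running the inner linear solve to convergence gives the exact step, while truncating the conjugate-gradient iterations gives a cheaper, approximate kernel.

```python
# Draw beta | - ~ N(A^{-1} X^T y, A^{-1}) with A = X^T X + D^{-1}, D diagonal:
# (1) u ~ N(0, D), (2) v = X u + N(0, I_n), (3) solve (X D X^T + I_n) w = y - v,
# (4) beta = u + D X^T w.  Truncating the CG solve perturbs the kernel.
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

rng = np.random.default_rng(9)
n, p = 100, 2000
X = rng.normal(size=(n, p))
y = X[:, :5] @ np.ones(5) + rng.normal(size=n)
d = rng.exponential(1.0, size=p)            # current prior scales (e.g. horseshoe)

def sample_beta(maxiter=None):
    u = np.sqrt(d) * rng.normal(size=p)     # u ~ N(0, D)
    v = X @ u + rng.normal(size=n)          # v ~ N(X u, I_n)
    M = LinearOperator((n, n), matvec=lambda w: X @ (d * (X.T @ w)) + w)
    w, _ = cg(M, y - v, maxiter=maxiter)    # exact kernel if run to convergence
    return u + d * (X.T @ w)

beta_exact = sample_beta()                  # exact step
beta_approx = sample_beta(maxiter=10)       # perturbed (truncated-CG) step
print(np.linalg.norm(beta_exact), np.linalg.norm(beta_approx))
```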
Guntuboyina, Aditya (40:27)
I will start by presenting some Hellinger accuracy results for the Nonparametric Maximum Likelihood Estimator (NPMLE) for Gaussian location mixture densities. I will then present two applications of the NPMLE: (a) empirical Bayes estimation of multivariate normal means, and (b) a multiple hypothesis testing problem involving univariate normal means. I will also talk about an extension to the mixture of linear regressions model. This is based on joint work with several collaborators who will be mentioned during the talk.
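A fixed-grid EM iteration is a standard computational device for the Kiefer-Wolfowitz NPMLE (a sketch; not necessarily the solver used in the talk): put atoms on a grid, update their weights, and read off empirical Bayes posterior means from the fitted mixture.

```python
# NPMLE for a Gaussian location mixture via EM on a fixed grid of atoms.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(10)
x = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 1, 500)])  # unit noise

grid = np.linspace(x.min(), x.max(), 300)
w = np.full(grid.size, 1.0 / grid.size)
L = norm.pdf(x[:, None] - grid[None, :])    # likelihood matrix N(x_i - t_k, 1)

for _ in range(500):                        # EM updates for the mixing weights
    post = w * L
    post /= post.sum(axis=1, keepdims=True)
    w = post.mean(axis=0)

post = w * L
post /= post.sum(axis=1, keepdims=True)
eb_posterior_mean = post @ grid             # empirical Bayes estimates of means
print("log-likelihood:", np.log(L @ w).sum())
print("first five EB means:", eb_posterior_mean[:5].round(2))
```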
Xu, Xinyi (40:34)
The Bayes factor is a widely used tool for Bayesian hypothesis testing and model comparison. However, it can be greatly affected by the prior elicitation for the model parameters. When prior information is weak, people often use proper priors with large variances, but Bayes factors under such convenient diffuse priors can be very sensitive to the arbitrary diffuseness of the priors. In this work, we propose an innovative method called the calibrated Bayes factor, which uses training samples to calibrate the prior distributions so that they reach a certain concentration level before Bayes factors are computed. This method provides reliable and robust model preferences under various true models. It makes no assumption on model forms (parametric or nonparametric) or on the integrability of priors (proper or improper), and so is applicable in a large variety of model comparison problems.
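The flavor of the calibration can be seen in a conjugate toy problem (a sketch only: here the prior is calibrated by a posterior update on a training split, whereas the talk's method drives the prior to a chosen concentration level): without calibration the Bayes factor tracks the arbitrary prior variance; with calibration, the answer stabilizes.

```python
# Testing M0: theta = 0 vs M1: theta ~ N(0, tau^2) for N(theta, 1) data.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(11)
x = rng.normal(0.4, 1.0, size=100)          # data with a modest true effect
n0 = 20                                     # training split for calibration
train, test = x[:n0], x[n0:]

def log_bf01(xs, mean1, var1):
    # via the sufficient statistic: xbar ~ N(0, 1/n) under M0,
    # xbar ~ N(mean1, var1 + 1/n) under (possibly calibrated) M1
    n, xbar = xs.size, xs.mean()
    return (norm.logpdf(xbar, 0.0, np.sqrt(1.0 / n))
            - norm.logpdf(xbar, mean1, np.sqrt(var1 + 1.0 / n)))

for tau2 in (1e2, 1e4, 1e6):
    naive = log_bf01(x, 0.0, tau2)
    post_prec = n0 + 1.0 / tau2             # conjugate update on the training split
    m, s2 = train.sum() / post_prec, 1.0 / post_prec
    calibrated = log_bf01(test, m, s2)
    print(f"tau^2={tau2:.0e}  naive log BF01={naive:6.2f}  "
          f"calibrated={calibrated:6.2f}")
```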
Mukherjee, Gourab (33:33)
We develop a novel shrinkage rule for prediction in a high-dimensional non-exchangeable hierarchical Gaussian model with an unknown spiked covariance structure. We propose a family of commutative priors for the mean parameter, governed by a power hyper-parameter, which encompasses scenarios ranging from perfect independence to high dependence. Corresponding to popular loss functions such as quadratic, generalized absolute, and linex losses, these prior models induce a wide class of shrinkage predictors that involve quadratic forms of smooth functions of the unknown covariance. Using uniformly consistent estimators of these quadratic forms, we propose an efficient procedure for evaluating these predictors that outperforms direct plug-in approaches based on factor models. We further improve our predictors by inspecting possible reductions in their variability through a novel coordinate-wise shrinkage policy that uses only covariance-level information and can be adaptively tuned using the sample eigenstructure. We extend our methodology to aggregation-based prescriptive analysis of generic multidimensional linear functionals of the predictors, which arise in many contemporary applications involving forecasting decisions on portfolios or combining predictions from disaggregated data. We propose an easy-to-implement functional substitution method for predicting linearly aggregated targets and establish the asymptotic optimality of our proposed procedure. We present simulation experiments as well as real-data examples illustrating the efficacy of the proposed method. This is joint work with Trambak Banerjee and Debashis Paul.
Komaki, Fumiyasu (23:31)
A class of improper priors for nonhomogeneous Poisson intensity functions is proposed. The priors in the class have shrinkage properties. The nonparametric Bayesian predictive densities based on the shrinkage priors have reasonable properties, although improper priors have not been widely used for nonparametric Bayesian inference. In particular, the nonparametric Bayesian predictive densities are admissible under the Kullback-Leibler loss.
Marchand, Éric (37:04)
This talk will address the estimation of predictive densities and their efficiency as measured by frequentist risk. For Kullback-Leibler, α-divergence, L1 and L2 losses, we review several recent findings that bring into play improvements by scale expansion, as well as duality relationships with point estimation and point prediction problems. A range of models is studied, including the multivariate normal with both known and unknown covariance structure, scale mixtures of normals, the Gamma model, as well as models with restrictions on the parameter space.
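The simplest instance of improvement by scale expansion is the normal model with known variance (standard facts, not the talk's general results): the Bayes predictive under the uniform prior expands the plug-in's scale from 1 to 1 + 1/n and has uniformly smaller Kullback-Leibler risk.

```python
# KL risk of the plug-in predictive N(xbar, 1) versus the scale-expanded
# Bayes predictive N(xbar, 1 + 1/n) for X_1..X_n ~ N(mu, 1).
import numpy as np

rng = np.random.default_rng(12)
n, reps, mu = 5, 100000, 0.0

def kl_normal(m0, v0, m1, v1):
    # KL( N(m0, v0) || N(m1, v1) )
    return 0.5 * (np.log(v1 / v0) + (v0 + (m0 - m1) ** 2) / v1 - 1.0)

xbar = mu + rng.normal(size=reps) / np.sqrt(n)
risk_plugin = kl_normal(mu, 1.0, xbar, 1.0).mean()
risk_bayes = kl_normal(mu, 1.0, xbar, 1.0 + 1.0 / n).mean()
print("KL risk, plug-in N(xbar, 1)   :", risk_plugin)
print("KL risk, Bayes N(xbar, 1+1/n) :", risk_bayes)
```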
Koenker, Roger (53:41)
Single-index linear models for binary response with random coefficients have been extensively employed in many settings under various parametric specifications of the distribution of the random coefficients. Nonparametric maximum likelihood estimation (NPMLE), as proposed by Kiefer and Wolfowitz (1956), has in contrast received less attention in applied work, due primarily to computational difficulties. We propose a new approach to the computation of NPMLEs for binary response models that significantly increases their computational tractability, thereby facilitating greater flexibility in applications. Our approach, which relies on recent developments involving the geometry of hyperplane arrangements by Rada and Černý (2018), is contrasted with the recently proposed deconvolution method of Gautier and Kitamura (2013).
Strawderman, William (35:19)
Ahmed, Ejaz S. (46:07)
Nowadays, large amounts of data are available, and the need for novel statistical strategies to analyze such data sets is pressing. This talk focuses on the development of statistical and computational strategies for a sparse regression model in the presence of mixed signals. Existing estimation methods have often ignored the contributions of weak signals; however, in reality, many predictors together provide useful information for prediction, even though the amount of such information in a single predictor might be modest. The search for such signals, sometimes called networks or pathways, is, for instance, an important topic for those working on personalized medicine. We discuss a new "post-selection shrinkage estimation strategy" that takes into account the joint impact of both strong and weak signals to improve prediction accuracy, and opens pathways for further research in such scenarios.
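One simple instantiation of using weak signals after selection (an illustrative sketch, not the speaker's exact post-selection shrinkage estimator): fit the lasso-selected predictors by least squares, then recover part of the weak-signal contribution with a heavily shrunken ridge fit on the remaining predictors.

```python
# Strong signals via selection + least squares; weak signals via shrunken ridge.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression, Ridge

rng = np.random.default_rng(13)
n, p = 200, 400
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:5] = 2.0; beta[5:105] = 0.1   # strong + many weak
y = X @ beta + rng.normal(size=n)
Xt = rng.normal(size=(1000, p)); yt = Xt @ beta         # test set (noiseless)

sel = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)     # selected "strong" set
ols = LinearRegression().fit(X[:, sel], y)
resid = y - ols.predict(X[:, sel])
rest = np.setdiff1d(np.arange(p), sel)
ridge = Ridge(alpha=50.0).fit(X[:, rest], resid)        # shrunken weak part

pred_sel = ols.predict(Xt[:, sel])
pred_both = pred_sel + ridge.predict(Xt[:, rest])
print("MSE, selection only          :", np.mean((yt - pred_sel) ** 2))
print("MSE, with shrunken weak part :", np.mean((yt - pred_both) ** 2))
```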
Ghosh, Malay (42:46)
We consider sparse Bayesian estimation in the classical multivariate linear regression model with p regressors and q response variables. In univariate Bayesian linear regression with a single response y, shrinkage priors that can be expressed as scale mixtures of normal densities are a popular approach for obtaining sparse estimates of the coefficients. We extend the use of these priors to the multivariate case to estimate a p × q coefficient matrix B. Our method can be used for any sample size n and any dimension p. Moreover, we show that the posterior distribution can consistently estimate B even when p grows at a nearly exponential rate with the sample size n. Concentration inequalities are proved, and our results are illustrated through simulations and data analysis.
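The kind of prior being extended can be sketched in a few lines (assuming a horseshoe-type scale mixture with row-wise local scales, one member of the scale-mixture family considered; this is not the paper's posterior sampler): each row of B gets its own local scale, so a whole predictor's q coefficients are shrunk jointly toward zero.

```python
# Draw from a row-wise scale-mixture-of-normals prior on a p x q matrix B:
# B_jk ~ N(0, tau^2 * lambda_j^2) with half-Cauchy local scales lambda_j.
import numpy as np

rng = np.random.default_rng(14)
p, q, tau = 30, 4, 0.1
lam = np.abs(rng.standard_cauchy(size=p))           # local scales (half-Cauchy)
B = rng.normal(size=(p, q)) * (tau * lam)[:, None]  # joint shrinkage by row

print("rows with ||B_j|| > 0.5:",
      int((np.linalg.norm(B, axis=1) > 0.5).sum()), "of", p)
```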