Large-scale Inference and the NEST Estimator

Luella Fu, PhD, Assistant Professor, San Francisco State University

Large-scale inference covers a set of statistical methodologies that address problems of detecting and estimating significant effects among thousands or more data points.  We start with an overview of four key questions for large-scale inference, with examples from modern-day applications.  In the context of these questions, we focus on the problem of assessing school quality from standardized test scores.  Essentially, we want to estimate a vector of normal means with heteroscedastic variances.  We propose the "Nonparametric Empirical Bayes SURE Tweedie's" (NEST) estimator, which estimates the marginal density of the data using a smoothing kernel that weights observations according to their distance from one another in both the observed test scores and their standard deviations.  NEST then plugs the estimated density into a generalized version of Tweedie's formula to estimate the corresponding mean vector.  Additionally, a Stein-type unbiased risk estimate (SURE) criterion is developed to select NEST's tuning parameters.  We discuss NEST in relation to other shrinkage estimators and present its algorithm, its theory, and its numerical performance.
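
For readers who want a concrete picture of the two steps described above, the following is a minimal Python sketch, not the authors' implementation.  It assumes observations X_i ~ N(theta_i, sigma_i^2) with known sigma_i, forms a bivariate Gaussian product-kernel estimate of the marginal density and its x-derivatives, applies the generalized Tweedie correction theta_hat_i = X_i + sigma_i^2 * (log f)'(X_i), and uses a standard Stein-type unbiased risk expression to compare candidate bandwidths.  All function names, the kernel choice, and the bandwidth grid are illustrative assumptions.

```python
import numpy as np

def _kernel_parts(x, s, hx, hs):
    # Bivariate Gaussian product-kernel estimates of the marginal density
    # f(x_i | s_i) and of its first two derivatives in x, at each observation.
    # Observations close in both score (x) and standard deviation (s) get
    # more weight, mirroring the weighting scheme described in the abstract.
    dx = (x[:, None] - x[None, :]) / hx
    ds = (s[:, None] - s[None, :]) / hs
    w = np.exp(-0.5 * (dx**2 + ds**2)) / (2 * np.pi * hx * hs)
    f = w.mean(axis=1)                            # f_hat(x_i)
    f1 = (-dx / hx * w).mean(axis=1)              # d f_hat / dx
    f2 = ((dx**2 - 1) / hx**2 * w).mean(axis=1)   # d^2 f_hat / dx^2
    return f, f1, f2

def nest_estimate(x, s, hx, hs):
    # Generalized Tweedie's formula for heteroscedastic normal means:
    # theta_hat_i = x_i + s_i^2 * f'(x_i) / f(x_i).
    f, f1, _ = _kernel_parts(x, s, hx, hs)
    return x + s**2 * f1 / f

def sure(x, s, hx, hs):
    # Stein-type unbiased risk estimate for an estimator of the form
    # x + s^2 * (log f)'(x); minimizing it over (hx, hs) tunes the bandwidths.
    f, f1, f2 = _kernel_parts(x, s, hx, hs)
    l1 = f1 / f                                   # (log f)'
    l2 = f2 / f - l1**2                           # (log f)''
    return np.mean(s**2 + s**4 * l1**2 + 2 * s**4 * l2)

# Illustrative use on simulated scores with heteroscedastic noise.
rng = np.random.default_rng(0)
theta = rng.normal(0.0, 2.0, size=500)            # true school effects
s = rng.uniform(0.5, 2.0, size=500)               # known standard deviations
x = rng.normal(theta, s)                          # observed test scores
grid = [0.3, 0.6, 1.2]                            # candidate bandwidths
hx, hs = min(((a, b) for a in grid for b in grid),
             key=lambda h: sure(x, s, *h))
theta_hat = nest_estimate(x, s, hx, hs)
```

The sketch only mirrors the overall structure of the method as the abstract presents it, namely kernel density estimation weighted in both score and standard deviation, a Tweedie-type correction, and SURE-based tuning; the exact kernel, parameterization, and risk criterion in NEST follow the paper.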