Assistant Professor, Department of Mathematics
San Francisco State University
Summary: High-dimensional data now arise in a wide range of areas, such as biology, imaging, and climate science. A common feature of such data is that the number of features can be much larger than the sample size, the so-called “large p, small n” problem. A specific example arises in genomic studies when detecting significant SNP sets or gene sets that are associated with a certain trait. To model the systematic mechanism and potentially complex interactions among genetic variants, we consider a flexible nonparametric function in a reproducing kernel Hilbert space. A test statistic is then proposed, and its asymptotic distributions are studied under the null hypothesis and under a series of local alternative hypotheses in the “large p, small n” setting. The methods are demonstrated through extensive simulation studies and real data analysis.
Followed by Social Hour