Tao He on Testing High-Dimensional Non-Parametric Functions With Application In SNP/Gene Set Analysis

Tao He, Assistant Professor, Department of Mathematics, San Francisco State University

Title: Testing High-Dimensional Non-Parametric Functions with Application in SNP/Gene Set Analysis

Summary: High-dimensional data now arise in a wide range of areas, such as biology, imaging, and climate science. A common feature of high-dimensional data is that the number of features can be much larger than the sample size, the so-called “large p, small n” problem. A specific example in genomic studies is encountered when detecting the significant SNP sets or gene sets that are associated with a certain trait. To model the systematic mechanism and potentially complex interactions among genetic variants, we consider a flexible nonparametric function in a reproducing kernel Hilbert space. A test statistic is then proposed, and its asymptotic distributions are studied under the null hypothesis and a series of local alternative hypotheses, under the “large p, small n” setting. The methods are demonstrated through extensive simulation studies and real data analysis.
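As a rough illustration of the kind of kernel-based association test described above, the following sketch computes a generic kernel score statistic, Q = (y - mean(y))' K (y - mean(y)), for a simulated SNP set with many more variants than samples. This is not the statistic or the asymptotic theory from the talk: the Gaussian kernel, the bandwidth choice, the permutation calibration, and all function names and simulated values are assumptions made only for this example.

```python
import numpy as np


def gaussian_kernel(X, bandwidth):
    """Gaussian (RBF) kernel matrix for the rows of X (hypothetical choice of kernel)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def kernel_score_statistic(y, K):
    """Quadratic form of the centered trait in the kernel matrix."""
    r = y - y.mean()
    return float(r @ K @ r)


def permutation_pvalue(y, K, n_perm=199, seed=0):
    """Permutation p-value: shuffle the trait to break any link with the SNP set."""
    rng = np.random.default_rng(seed)
    observed = kernel_score_statistic(y, K)
    exceed = 0
    for _ in range(n_perm):
        if kernel_score_statistic(rng.permutation(y), K) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


# Simulated "large p, small n" data: 40 subjects, 500 SNPs coded 0/1/2.
rng = np.random.default_rng(42)
n, p = 40, 500
X = rng.integers(0, 3, size=(n, p)).astype(float)
# Assumed trait model for the demo: an interaction between the first two SNPs plus noise.
y = X[:, 0] * X[:, 1] + rng.normal(size=n)

K = gaussian_kernel(X, bandwidth=np.sqrt(p))  # bandwidth is an arbitrary heuristic
print("kernel score statistic:", kernel_score_statistic(y, K))
print("permutation p-value:", permutation_pvalue(y, K))
```

The permutation step stands in for the asymptotic null distribution studied in the talk; it is used here only because it needs no distributional theory to run.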

Susan Murphy - The Stratified Micro-Randomized Trial with Application to Mobile Health

Susan Murphy, Ph.D., Harvard University

Title: The Stratified Micro-Randomized Trial with Application to Mobile Health

Summary: Technological advancements in the field of mobile devices and wearable sensors make it possible to deliver treatments anytime and anywhere to users like you and me. Increasingly, the delivery of these treatments is triggered by detections/predictions of vulnerability and receptivity. These observations are likely to have been impacted by prior treatments. Furthermore, the treatments are often designed to have an impact on users over a span of time during which subsequent treatments may be provided. Here we discuss our work on the design of a mobile health smoking cessation study in which the above two challenges arose. This work involves the use of multiple online data analysis algorithms. Online algorithms are used in the detection, for example, of physiological stress. Other algorithms are used to forecast, at each vulnerable time, the remaining number of vulnerable times in the day. The outputs of these algorithms are then inputs to a randomization algorithm that ensures that each user is randomized to each treatment an appropriate number of times per day. We develop the stratified micro-randomized trial, which involves not only the randomization algorithm but also a precise statement of the meaning of the treatment effects and the primary scientific hypotheses, along with primary analyses and sample size calculations. Considerations of causal inference and potential causal bias incurred by inappropriate data analyses play a large role throughou