COLUMBIA UNIVERSITY HEALTH SCIENCES
Identifying Huntington’s disease markers by modern statistical learning methods.
learning strategy, Disease Marker, Huntington Disease, Machine Learning, CAG repeat
DESCRIPTION (provided by applicant): Designing an efficient Huntington’s disease (HD) early intervention clinical trial for individuals who have an expanded CAG repeat in the huntingtin gene requires identifying and combining clinical, biological, cognitive, and brain imaging markers to accurately distinguish subjects who will receive a diagnosis during a given intervention period from those who will not, and to track early changes in the disease course. The goal of this project is to identify sensitive biomarkers for HD risk stratification, indexing disease progression, and developing clinical trial endpoints. The proposal directly addresses two of the “4P’s” of Medicine in the NIH New Strategic Vision: it will offer promising ways to predict when the disease will develop, and it will increase the capacity to personalize early intervention based on the informative patient-specific markers our models identify. Combining biomarkers to predict HD onset and progression is an essential step in a continuum of research for development of disease-modifying therapies. Composite markers and their risk profiles created from our models will offer a quantitative way to monitor and compare potential interventions. Evidence collected from these comparisons will advance the development of efficacy studies in premanifest HD, where neuroprotective treatments would be most beneficial. To achieve these goals, we develop and apply a series of cutting-edge statistical learning methods based on support vector machines (SVM), variable selection, and dimension reduction. These modern statistical methods, designed for correlated big data, have quickly emerged as among the most successful tools for hypothesis generation, classification, and prediction in biomedical studies. However, they have not been introduced to HD biomarker research.
In Aim 1, using a counting process framework, we propose an SVM approach for time-to-event outcomes (e.g., time to HD diagnosis) that combines markers into risk scores to discriminate subjects who will experience HD onset in the immediate future from those who will not, based on their personalized features. Although SVM is well studied for binary outcomes, it is far less explored for time-to-event outcomes. We fill this gap in knowledge. In Aim 2, we propose new learning methods for longitudinal outcomes that combine markers that modify the course of HD signs, in order to monitor the disease process and distinguish subjects with rapid progression from those with slower progression. In Aim 3, we propose to use novel and robust performance measures to compare the derived combined markers with existing disease indices and key markers. These aims will fundamentally advance our understanding of markers linked to HD onset and progression. The creation of statistical models for composite markers and risk profiles is especially useful in: (1) offering quantitative ways to monitor and compare potential interventions, and (2) improving the power of efficacy studies targeted at premanifest individuals by narrowing the predictive interval, which allows future clinical trials to be shorter and to enroll fewer subjects. Finally, our improved predictions of HD onset and progression will make genetic counseling sessions more informative for pre-symptomatic subjects at risk of HD.
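As a rough illustration of the ideas behind Aims 1 and 3, and not the method proposed here, the sketch below fits a linear risk score with a pairwise hinge loss over risk sets (a simplified stand-in for an SVM on time-to-event data) and then compares the composite score with a single marker using a concordance index for right-censored outcomes. The function names, toy data, and hyperparameters are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def risk_set_pairs(times: Sequence[float], events: Sequence[int]) -> List[Tuple[int, int]]:
    """Pair each subject with an observed event (i) with every subject
    still under follow-up at a strictly later time (j)."""
    pairs = []
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored subjects do not define an event time
        for j, t_j in enumerate(times):
            if t_j > t_i:
                pairs.append((i, j))
    return pairs


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(w, x))


def fit_ranking_svm(X, times, events, lam=0.01, lr=0.1, epochs=200) -> List[float]:
    """Subgradient descent on an L2-penalized pairwise hinge loss:
    the subject with the earlier event should score at least 1 higher
    than each subject still at risk. Illustrative only."""
    pairs = risk_set_pairs(times, events)
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [lam * w_k for w_k in w]
        for i, j in pairs:
            diff = [a - b for a, b in zip(X[i], X[j])]
            if dot(w, diff) < 1.0:  # margin violated for this pair
                for k, d in enumerate(diff):
                    grad[k] -= d / len(pairs)
        w = [w_k - lr * g for w_k, g in zip(w, grad)]
    return w


def concordance_index(scores, times, events) -> float:
    """Fraction of comparable pairs in which the subject with the earlier
    event has the higher risk score (ties in score count as 0.5)."""
    pairs = risk_set_pairs(times, events)
    total = 0.0
    for i, j in pairs:
        if scores[i] > scores[j]:
            total += 1.0
        elif scores[i] == scores[j]:
            total += 0.5
    return total / len(pairs)


# Hypothetical toy data: two markers per subject, follow-up time,
# and an event indicator (1 = diagnosis observed, 0 = censored).
X = [[2.0, 0.1], [1.5, 0.3], [1.0, 0.2], [0.5, 0.4], [0.0, 0.0]]
times = [1.0, 2.0, 3.0, 4.0, 5.0]
events = [1, 1, 0, 1, 0]

w = fit_ranking_svm(X, times, events)
composite = [dot(w, x) for x in X]
c_composite = concordance_index(composite, times, events)
c_single = concordance_index([x[1] for x in X], times, events)
print("weights:", w)
print("C-index, composite score:", c_composite)
print("C-index, second marker alone:", c_single)
```

The concordance index is used here only as a familiar example of a discrimination measure for censored data; the novel and robust performance measures referred to in Aim 3 are not specified in this description.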
PUBLIC HEALTH RELEVANCE: The goal of Huntington’s disease (HD) research is to develop experimental therapeutics to delay onset or slow disease progression, and to provide different treatment regimens at each disease stage. To meet this goal, this proposal develops and applies a series of advanced statistical approaches to rank and combine clinical, behavioral, and brain imaging markers to predict HD diagnosis in premanifest subjects during a given time period and to measure disease progression. The creation of models for composite markers and risk profiles is useful for offering quantitative ways to monitor and compare interventions and for powering clinical trials in premanifest HD individuals.