TOWARD A CONNECTIVITY GRADIENT-BASED FRAMEWORK FOR REPRODUCIBLE BIOMARKER DISCOVERY
Hong S-J., Xu T., Nikolaidis A., Smallwood J., Margulies DS., Bernhardt B., Vogelstein J., Milham MP.
<jats:title>A<jats:sc>bstract</jats:sc></jats:title><jats:p>Despite myriad demonstrations of feasibility, the high dimensionality of fMRI data remains a critical barrier to its utility for reproducible biomarker discovery. Recent studies applying dimensionality reduction techniques to resting-state fMRI (R-fMRI) have unveiled neurocognitively meaningful connectivity gradients that are present in both human and non-human primate brains, and appear to differ meaningfully among individuals and clinical populations. Here, we provide a critical assessment of the suitability of connectivity gradients for biomarker discovery. Using the Human Connectome Project (discovery subsample=209; two replication subsamples=209×2) and the Midnight Scan Club (n=9), we tested three key biomarker traits of functional gradients: reliability, reproducibility and predictive validity. In doing so, we systematically assessed the effects of three analytical settings, including <jats:italic>i</jats:italic>) dimensionality reduction algorithms (<jats:italic>i.e</jats:italic>., linear <jats:italic>vs</jats:italic>. non-linear methods), <jats:italic>ii</jats:italic>) input data types (<jats:italic>i.e</jats:italic>., raw time series, [un-]thresholded functional connectivity), and <jats:italic>iii</jats:italic>) amount of data (R-fMRI time-series lengths). We found that the reproducibility of functional gradients across algorithms and subsamples is generally higher for those explaining more variance in whole-brain connectivity data, as well as for those having higher reliability. Notably, among the analytical settings tested, linear dimensionality reduction (principal component analysis in our study), more conservatively thresholded functional connectivity (<jats:italic>e.g</jats:italic>., 95-97%) and longer time-series data (≥20 min) were found to be the preferable conditions for obtaining higher reliability.
Gradients with higher reliability were able to predict unseen phenotypic scores with higher accuracy, highlighting reliability as a critical prerequisite for validity. Importantly, prediction accuracy with connectivity gradients exceeded that observed with more traditional edge-based connectivity measures, suggesting the added value of a low-dimensional gradient approach. Finally, the present work highlights the importance and benefits of systematically exploring the parameter space for new imaging methods before widespread deployment.</jats:p><jats:sec><jats:title>H<jats:sc>ighlights</jats:sc></jats:title><jats:list list-type="simple"><jats:list-item><jats:p>- There is a growing need to identify benchmark parameters for advancing functional connectivity gradients into reliable biomarkers.</jats:p></jats:list-item><jats:list-item><jats:p>- Here, we explored a multidimensional parameter space for calculating functional gradients to improve their reproducibility, reliability and predictive validity.</jats:p></jats:list-item><jats:list-item><jats:p>- We demonstrated that more reproducible and reliable gradient markers tend to have higher predictive power for unseen phenotypic scores across various cognitive domains.</jats:p></jats:list-item><jats:list-item><jats:p>- We showed that the low-dimensional connectivity gradient approach could outperform raw edge-based analyses in predicting phenotypic scores.</jats:p></jats:list-item><jats:list-item><jats:p>- We highlight the necessity of optimizing parameters for new imaging methods before their widespread deployment.</jats:p></jats:list-item></jats:list></jats:sec>
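To make the analytical settings named in the abstract concrete, the following is a minimal sketch of one way a PCA-based gradient could be computed from thresholded functional connectivity. It is not the authors' pipeline. The function name `connectivity_gradients`, the row-wise percentile thresholding, the column-centering step, and the synthetic input are all assumptions introduced for illustration; the abstract specifies only that principal component analysis and thresholds in the 95-97% range were among the settings examined.

```python
import numpy as np


def connectivity_gradients(timeseries, threshold_pct=95.0, n_gradients=3):
    """Illustrative PCA gradients from a (n_timepoints, n_regions) array.

    Hypothetical helper: thresholding and centering choices are assumptions,
    not details taken from the source.
    """
    # Functional connectivity: region-by-region Pearson correlation.
    fc = np.corrcoef(timeseries, rowvar=False)

    # Assumed row-wise thresholding: keep only entries at or above each
    # row's percentile cutoff and set the rest to zero.
    cutoffs = np.percentile(fc, threshold_pct, axis=1, keepdims=True)
    fc_thr = np.where(fc >= cutoffs, fc, 0.0)

    # PCA via SVD of the column-centered thresholded matrix.
    centered = fc_thr - fc_thr.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)

    # Fraction of variance explained by each component.
    explained = s**2 / np.sum(s**2)

    # Component scores per region serve as the gradient values.
    gradients = u[:, :n_gradients] * s[:n_gradients]
    return gradients, explained[:n_gradients]


# Synthetic data standing in for one subject's parcellated R-fMRI run.
rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 50))
grads, var_explained = connectivity_gradients(ts, threshold_pct=95.0)
```

Each column of `grads` assigns one value per region, and `var_explained` gives the fraction of variance each component accounts for, the quantity the abstract relates to reproducibility. Varying `threshold_pct` or the number of time points passed in is one way to explore the parameter space discussed above.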