
Recent Publication by Sanders-Brown Researchers Looks at Issue of Data Redundancy in Machine Learning

Mark Ebbert and his research assistant on Aug. 5, 2021. Pete Comparoni | UK Photo.

LEXINGTON, Ky. (Dec. 1, 2021) — Work by a group of researchers at the University of Kentucky’s Sanders-Brown Center on Aging was recently published in Genes. The article looks at the use of data mining and machine learning in research.

The Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset contains extensive patient measurements, including magnetic resonance imaging (MRI), biometrics and RNA expression, from Alzheimer’s disease cases and controls. Machine learning algorithms have recently used these measurements to evaluate Alzheimer’s disease onset and progression. While using a variety of biomarkers is essential to Alzheimer’s disease research, highly correlated input features can significantly decrease the generalizability and performance of machine learning models. Redundant features also unnecessarily increase the computational time and resources needed to train predictive models.

Justin Miller, Ph.D., assistant professor in the UK College of Medicine, directed this work in collaboration with Mark Ebbert, Ph.D., also an assistant professor in the UK College of Medicine, and staff scientists Erik Huckvale and Matthew Hodgman. Together, they used 49,288 biomarkers and 793,600 extracted MRI features to assess feature correlation within the ADNI dataset and to determine how much this issue might affect large-scale analyses using these data. Miller says they found that more than 90% of the biomarkers, gene expression data, and MRI data included in the ADNI dataset are very highly correlated with at least one other datatype, which could pose unforeseen challenges for using machine learning to identify patterns across the diverse data available in that dataset.

In this publication, Miller and his colleagues provide mappings of the highly correlated features so that future studies can consider this feature correlation and improve machine learning accuracy and efficiency in Alzheimer’s disease research.
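
For readers unfamiliar with the idea, a mapping of highly correlated features can be pictured with a small Python sketch. This is an illustration only and not the method or code used in the publication: the feature names, the toy values and the 0.9 correlation threshold are all assumptions chosen for the example, and Pearson correlation is used simply because it is a common choice.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    # A constant feature has no defined correlation; treat it as uncorrelated.
    return cov / denom if denom else 0.0


def correlated_feature_map(features, threshold=0.9):
    """Map each feature name to the other features whose absolute
    correlation with it is at least `threshold` (an assumed cutoff)."""
    names = list(features)
    mapping = {name: [] for name in names}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(features[first], features[second])
            if abs(r) >= threshold:
                mapping[first].append(second)
                mapping[second].append(first)
    # Keep only features that have at least one highly correlated partner.
    return {name: partners for name, partners in mapping.items() if partners}


# Hypothetical toy measurements, not ADNI data.
features = {
    "feature_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "feature_a_rescaled": [2.1, 3.9, 6.2, 8.0, 9.8, 12.1],
    "feature_b": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}

mapping = correlated_feature_map(features)
print(mapping)
```

In this toy case the two nearly proportional features are mapped to each other, while the unrelated one does not appear in the mapping. A study could use such a mapping to keep one representative from each correlated group before training a model. Comparing every pair this way would be slow for hundreds of thousands of features, so the sketch shows the concept rather than a practical pipeline.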

“Feature correlation has always been an issue in large datasets, but it was previously unknown the extent to which this issue permeated the Alzheimer’s Disease Neuroimaging dataset,” said Miller. “This research will help improve data mining accuracy and efficiency in the ADNI dataset. Machine learning is a promising avenue of research to identify patterns that can one day improve patient care. This research lays the groundwork for those future analyses.”

This work was supported by the BrightFocus Foundation under Award Number A2020118F. Research reported in this publication was also supported by the National Institute on Aging of the National Institutes of Health under Award Numbers R01AG046171, RF1AG051550 and 3U01AG024904-09S4. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

As the state’s flagship, land-grant institution, the University of Kentucky exists to advance the Commonwealth. We do that by preparing the next generation of leaders — placing students at the heart of everything we do — and transforming the lives of Kentuckians through education, research and creative work, service and health care. We pride ourselves on being a catalyst for breakthroughs and a force for healing, a place where ingenuity unfolds. It's all made possible by our people — visionaries, disruptors and pioneers — who make up 200 academic programs, a $476.5 million research and development enterprise and a world-class medical center, all on one campus.   

In 2022, UK was ranked by Forbes as one of the “Best Employers for New Grads” and named a “Diversity Champion” by INSIGHT into Diversity, a testament to our commitment to advance Kentucky and create a community of belonging for everyone. While our mission looks different in many ways than it did in 1865, the vision of service to our Commonwealth and the world remains the same. We are the University for Kentucky.