An Analysis Of The Framingham Heart Study Dataset
All diseases affecting the heart or blood vessels, including coronary heart disease (clogged arteries), fall under the category of cardiovascular disease (C.V.D.). Coronary heart disease continues to be a leading cause of morbidity and mortality among adults in Europe and North America.
Data from the Framingham Heart Study were analysed using SAS Enterprise Miner V14.1. The first goal of the study was to use data mining methods to identify the relationship between C.V.D. and health indicators, including blood pressure and cholesterol level. The findings were then evaluated to formulate recommendations for continued proactive health guidance and monitoring. A range of risk prediction classifiers, including decision trees, neural networks and gradient boosting, was developed to estimate the likelihood of mortality as a function of the health indicators provided. The second goal was to assess the risk profiles of the study participants accurately and to identify the individuals and groups most at risk, in order to inform health guidelines for publication through appropriate channels and recommendations on treatment strategies.
Neural network models trained on the dataset split by gender delivered the best performance in predicting mortality from cardiovascular disease. Recommendations include lifetime observation of participants, improving the quality of data collection, targeting healthcare guidance and treatment, and a two-stage approach to operationalising the classifiers.
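The idea of fitting a separate classifier to each gender group, rather than one pooled model, can be illustrated with a minimal Python sketch. It is not the SAS Enterprise Miner workflow used in the study, and it swaps the neural network for a depth-1 decision tree (a decision stump) to stay dependency-free. The data are synthetic, and the field names (`sbp`, `chol`, `died`), the thresholds and the sex-specific risk rule are all invented for illustration; none of them come from the Framingham dataset.

```python
import random


def fit_stump(rows, features, label):
    """Fit a depth-1 decision tree: choose the (feature, threshold) pair whose
    rule 'predict 1 when feature > threshold' misclassifies the fewest rows.
    Simplification: the direction of the rule is fixed."""
    best = None
    for feat in features:
        values = sorted({r[feat] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            errors = sum((r[feat] > thr) != bool(r[label]) for r in rows)
            if best is None or errors < best[0]:
                best = (errors, feat, thr)
    if best is None:
        raise ValueError("need at least two distinct values in some feature")
    return best[1], best[2]


def predict(stump, row):
    feat, thr = stump
    return int(row[feat] > thr)


def accuracy(stump, rows, label):
    return sum(predict(stump, r) == r[label] for r in rows) / len(rows)


def fit_by_group(rows, group_key, features, label):
    """Partition rows by group_key and fit one stump per group."""
    groups = {}
    for r in rows:
        groups.setdefault(r[group_key], []).append(r)
    return {g: fit_stump(rs, features, label) for g, rs in groups.items()}


def make_synthetic(n, seed=0):
    """Generate invented records; NOT real Framingham data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        sex = rng.choice(["F", "M"])
        sbp = rng.gauss(135, 20)    # hypothetical systolic blood pressure
        chol = rng.gauss(230, 40)   # hypothetical total cholesterol
        # Invented rule: outcome driven by a different indicator in each sex.
        at_risk = sbp > 145 if sex == "M" else chol > 250
        # Flip roughly 10% of labels to simulate noise.
        died = int(at_risk != (rng.random() < 0.1))
        rows.append({"sex": sex, "sbp": sbp, "chol": chol, "died": died})
    return rows


rows = make_synthetic(400, seed=42)
train, test = rows[:300], rows[300:]
features = ["sbp", "chol"]

pooled = fit_stump(train, features, "died")
by_sex = fit_by_group(train, "sex", features, "died")

pooled_acc = accuracy(pooled, test, "died")
split_acc = sum(
    predict(by_sex[r["sex"]], r) == r["died"] for r in test
) / len(test)

print(f"pooled model accuracy:     {pooled_acc:.2f}")
print(f"per-gender model accuracy: {split_acc:.2f}")
```

Because the synthetic outcome depends on a different indicator in each group, a single pooled rule cannot capture both relationships, which is the intuition behind evaluating gender-specific models. Any accuracy gap here is an artefact of the invented data and says nothing about the study's actual results.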