Enhancing Disease Diagnosis in Healthcare Using LSTM Networks with Optimized Feature Selection: A Comparative Research on Heart Disease, Breast Cancer, and Liver Disease Datasets
Abstract
Clinical diagnosis relies heavily on accurate and timely medical assessment. However, clinical data sets are typically large, incomplete, and noisy, which limits the reliability of traditional diagnostic models. To address these limitations, this study developed an integrated model that combines an optimised feature selection strategy with a Long Short-Term Memory (LSTM) network to improve accuracy across three clinical benchmarks: heart disease, breast cancer, and liver disorder. The first step was feature selection, which retained only those features deemed clinically relevant. With this reduced feature set, the subsequent LSTM model could recognise temporal patterns in a patient's clinical data more easily than it could from the original data set. The results demonstrate that the proposed methodology achieved higher accuracy than both baseline classifiers and non-optimised LSTM models across all three data sets. Specifically, the proposed method achieved an accuracy of 91.5% for heart disease, 97.6% for breast cancer, with a Receiver Operating Characteristic Area Under the Curve (ROC-AUC) of 0.99, and 83.9% for liver disorder. Recall and F1-score also improved. Overall, the results indicate that integrating dimensionality reduction with a sequential learning approach produces a reliable and generalisable clinical diagnostic model that can support clinical decision-making in a range of health care settings.
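For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and F1-score are conventionally computed for a binary diagnostic classifier. It is not the authors' evaluation code: the function name `binary_metrics` and the toy labels are assumptions made purely for illustration, and the values it produces are unrelated to the results reported above.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for 0/1 labels.

    Hypothetical helper for illustration; 1 denotes the positive
    (disease) class and 0 the negative class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall (sensitivity): share of true disease cases that were detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for this example only.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Recall is often the metric of most interest in a diagnostic setting, since a false negative corresponds to a missed case; this is one reason an abstract may report recall and F1-score alongside accuracy.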
Copyright (c) 2026 ITEGAM-JETIA

This work is licensed under a Creative Commons Attribution 4.0 International License.








