Integrating Proteogenomics with Electronic Health Record Phenotypes for Chronic Kidney Disease Risk Prediction: Interpretability, Bias, and Real-World Performance

Atukunda Derrick

Department of Pharmaceutical Microbiology and Biotechnology, Kampala International University, Uganda

Email: derrick.atukunda@studwc.kiu.ac.ug

ABSTRACT

Chronic kidney disease (CKD) is a major global health burden characterized by high morbidity, mortality, and frequent underdiagnosis in early stages. Accurate and interpretable risk prediction models are essential for improving early detection and guiding preventive interventions. This study explores the integration of proteogenomic biomarkers with electronic health record (EHR)-derived phenotypes to enhance CKD risk prediction, while examining interpretability, bias, and real-world performance. Using longitudinal EHR data combined with genomic and plasma proteomic features, we evaluate baseline statistical and machine-learning models, including logistic regression, gradient-boosted trees, and neural networks, alongside interpretable frameworks such as generalized additive models and Shapley-based feature attribution. The analysis assesses whether proteogenomic features improve predictive accuracy, mitigate bias across demographic groups, and provide clinically meaningful biological insights beyond traditional EHR-only models. Findings suggest that while EHR-derived phenotypes remain strong predictors of CKD progression, proteogenomic integration offers modest but meaningful improvements in biological interpretability, bias reduction, and subgroup stratification. Real-world deployment considerations, including data heterogeneity, computational constraints, privacy governance, and cross-population generalizability, are discussed. The study highlights the potential of integrative multi-omic–EHR frameworks to advance precision nephrology, while emphasizing the need for standardized validation, responsible AI practices, and scalable clinical implementation strategies.

Keywords: Chronic Kidney Disease (CKD), Proteogenomics, Electronic Health Records (EHR), Risk Prediction Models, Interpretable Machine Learning.

CITE AS: Atukunda Derrick (2026). Integrating Proteogenomics with Electronic Health Record Phenotypes for Chronic Kidney Disease Risk Prediction: Interpretability, Bias, and Real-World Performance. RESEARCH INVENTION JOURNAL OF SCIENTIFIC AND EXPERIMENTAL SCIENCES 6(1):55-66. https://doi.org/10.59298/RIJSES/2026/615566