Traditional statistics versus machine learning in clinical registries: A pragmatic workflow for matching methods to data and clinical questions

DOI: https://doi.org/10.24170/26-2-8319

Abstract

This piece discusses the importance of data type, identification, and organisation for machine learning (ML) and neural network (NN) development, and the applicability of ML for statistical analysis in large clinical and physiological datasets, such as the South African Heart Association Registry (SHARE).

Core outcomes/key lessons

To enable clinicians and researchers to:

  • Systematically assess their clinical dataset (registry data, e.g. SHARE) for variable types, dimensionality, sample size, missingness, and event rates.
  • Understand when traditional statistical methods are sufficient, when regularised regression is preferable, and when more complex ML approaches are justified.
  • Recognise common pitfalls (overfitting, multicollinearity, data leakage, mis-specified outcomes) and learn how to avoid them in both “classic” and ML settings.
  • Apply a staged workflow to their own data, using the SHARE-transcatheter aortic valve implantation (TAVI) registry as an illustrative case.
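
As a rough illustration of the first outcome, the dataset assessment could be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' workflow: the function names, the field names (`age`, `sex`, `creatinine`, `death_30d`), and the toy records are all hypothetical and are not SHARE data. The events-per-variable figure is a commonly used heuristic added here for illustration; the abstract does not mention it, and any cut-off applied to it is a judgment call.

```python
from __future__ import annotations

from typing import Any


def infer_type(values: list[Any]) -> str:
    """Crude variable-type guess from the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return "empty"
    if len(set(observed)) <= 2:
        return "binary"  # two or fewer distinct observed values
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in observed):
        return "numeric"
    return "categorical"


def profile_registry(rows: list[dict[str, Any]], outcome: str) -> dict[str, Any]:
    """Summarise variable types, dimensionality, sample size, missingness, event rate."""
    n = len(rows)
    predictors = sorted({key for row in rows for key in row} - {outcome})
    columns = {col: [row.get(col) for row in rows] for col in predictors}

    # Assumption: the outcome is coded 1 = event, 0 = no event.
    # Records with a missing outcome are left out of the event-rate denominator.
    observed_outcome = [row[outcome] for row in rows if row.get(outcome) is not None]
    events = sum(1 for y in observed_outcome if y == 1)

    return {
        "n": n,
        "n_predictors": len(predictors),
        "variable_types": {col: infer_type(vals) for col, vals in columns.items()},
        "missing_fraction": {
            col: sum(v is None for v in vals) / n for col, vals in columns.items()
        },
        "events": events,
        "event_rate": events / len(observed_outcome) if observed_outcome else None,
        # Heuristic only: outcome events divided by candidate predictors.
        "events_per_variable": events / len(predictors) if predictors else None,
    }


# Made-up toy records for illustration only; None marks a missing value.
rows = [
    {"age": 81, "sex": "F", "creatinine": 98.0, "death_30d": 0},
    {"age": 77, "sex": "M", "creatinine": None, "death_30d": 1},
    {"age": 85, "sex": "F", "creatinine": 120.0, "death_30d": 0},
    {"age": 79, "sex": "M", "creatinine": 101.0, "death_30d": 0},
]
summary = profile_registry(rows, outcome="death_30d")
for key, value in summary.items():
    print(f"{key}: {value}")
```

A summary like this only informs the choice of method. A small sample with few events relative to the number of candidate predictors would usually argue for a simpler or regularised model rather than a complex ML approach, but that decision still depends on the clinical question.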

Author Biographies

A Wentzel, North-West University

Hypertension in Africa Research Team (HART), North-West University, Potchefstroom, South Africa
South African Medical Research Council Unit for Hypertension and Cardiovascular Disease, North-West University, Potchefstroom, South Africa

E Schaafsma, South African Heart Association

SHARE Registry Projects, South African Heart Association, Johannesburg, South Africa 

M Blignaut, Stellenbosch University

Centre for Cardio-Metabolic Research in Africa (CARMA), Division of Medical Physiology, Department of Biomedical Sciences, Stellenbosch University, Tygerberg, South Africa

Published

2026-05-08

How to Cite

Wentzel, A., Schaafsma, E., & Blignaut, M. (2026). Traditional statistics versus machine learning in clinical registries: A pragmatic workflow for matching methods to data and clinical questions. SA Heart Journal, 26(2), 97–102. https://doi.org/10.24170/26-2-8319

Section

Statistics made easy