Computational methods for optimal stratified sampling of ECE outcomes, fees and household income in South Africa

Authors

  • Georgi Borros, Southern Africa Labour and Development Research Unit, University of Cape Town; Department of Statistical Sciences, University of Cape Town
  • Şebnem Er, Department of Statistical Sciences, University of Cape Town
  • Sulaiman Salau, Department of Statistical Sciences, University of Cape Town

DOI:

https://doi.org/10.37920/sasj.2026.60.1.1

Keywords:

Sampling, Stratification, Survey methodology

Abstract

In South Africa, the country with the highest income inequality in the world and an unemployment rate of 32.9%, survey research is a crucial enabler of evidence-based policymaking to drive inclusive growth (Francis and Webster, 2019; Statistics South Africa, 2025a). For decades, nationally representative household surveys have provided the country with key information on livelihoods and on areas for targeted decision making by capturing measures such as household income, food expenditure and employment status. More recently, notable survey research has been undertaken in early childhood education (ECE), as seen in the Thrive by Five Index, a nationally representative survey of child outcomes using ECE tools developed and validated for the South African context. Several computational methods in the literature offer solutions for determining optimum stratum boundaries and allocating sample sizes in stratified sampling. The uptake of these computational methods in the South African context, however, remains limited. Our study offers the first quantitative evaluation of the more common stratification approaches used in South Africa against five prominent computational methods from the literature: random search, genetic algorithm, biased random key genetic algorithm, grouping genetic algorithm, and variable neighbourhood search. The findings indicate that substantial precision gains can be realised by adopting these computational methods. This study is also the first application of the methods to South African datasets, contributing to the literature with notable, recent research use cases in the country: the Thrive by Five Index 2021, the ECD Census 2021, and the General Household Surveys of 2023 and 2024. Through comprehensive evaluation, the work offers insights for performing stratified sampling in these applied contexts using existing methods available in the R programming language.

Published

2026-03-30

How to Cite

Borros, G., Er, Ş., & Salau, S. (2026). Computational methods for optimal stratified sampling of ECE outcomes, fees and household income in South Africa. South African Statistical Journal, 60(1), 1–24. https://doi.org/10.37920/sasj.2026.60.1.1

Section

Research Articles