Toward Improved Data Quality in Public Health: Analysis of Anomaly Detection Tools applied to HIV/AIDS Data in Africa
Abstract
The study examined how efficiently the WHO Data Quality Review (DQR) toolkit and PyCaret anomaly detection algorithms assess data quality. The tools were applied to African HIV/AIDS data (2015-2021) extracted from a public data repository (data.pepfar.gov). The results suggest that unsupervised anomaly detection algorithms could complement the WHO DQR toolkit and improve Data Quality Assessment (DQA). In particular, the study showed that anomaly detection algorithms implemented in Python provide a more straightforward and more reliable process for detecting data inconsistencies, incompleteness, and timeliness problems, and appear more accurate than the WHO tool. Consequently, the study contributes to ongoing debates on improving health data quality in low-income African countries.
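To give a rough sense of what unsupervised anomaly detection on reported counts can look like, the following is a minimal sketch that uses only the Python standard library. It is not the study's method: the study used PyCaret's algorithms, whereas this sketch substitutes a simple modified z-score (median and median absolute deviation) as a stand-in. The function name `flag_anomalies`, the threshold of 3.5, and the sample counts are all hypothetical and chosen only for illustration.

```python
from statistics import median


def flag_anomalies(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    Stand-in technique (not the study's PyCaret models): the modified
    z-score is 0.6745 * (x - median) / MAD, where MAD is the median
    absolute deviation from the median.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All deviations are (mostly) zero, so the score is undefined;
        # report nothing rather than divide by zero.
        return []
    return [
        i
        for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


# Hypothetical quarterly counts for one site; invented for illustration,
# not taken from data.pepfar.gov.
reported = [120, 118, 125, 122, 119, 540, 121, 117]
suspect = flag_anomalies(reported)
print(suspect)
```

With these made-up counts, only the sixth value (540) is flagged for review. In practice a flagged value would be a prompt for follow-up with the reporting site, not proof of an error.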