Scarcity of labels in non-stationary data streams: A survey
In a dynamic stream, the underlying process generating the data is assumed to be non-stationary, so concepts within the stream drift and change as the stream progresses. Concepts learned by a classification model are therefore prone to change, and non-adaptive models are likely to deteriorate and become ineffective over time. The challenge of recognising and reacting to change in a stream is compounded by the scarcity-of-labels problem: the very realistic situation in which the true class label of an incoming point is not immediately available (or might never be available), or in which manually annotating data points is prohibitively expensive. In a high-velocity stream it may be impossible to manually label every incoming point and pursue a fully supervised approach. In this article we formally describe the types of change which can occur in a data stream and then catalogue the methods for dealing with change when there is limited access to labels. We present an overview of the most influential ideas in the field along with recent advancements, and we highlight trends, research gaps, and future research directions.
The file attached to this record is the author's final peer-reviewed version.
Citation : Fahy, C., Yang, S. and Gongora, M. (2021) Scarcity of labels in non-stationary data streams: A survey. ACM Computing Surveys.
ISSN : 0360-0300
Research Institute : Institute of Artificial Intelligence (IAI)
Peer Reviewed : Yes