Parallel and Distributed Systems Group

Computer Science Department of Telecom SudParis

Leveraging Bagging for Evolving Data Streams

Reading group: Ewa Turska presented "Leveraging Bagging for Evolving Data Streams" (ECML PKDD'10) in 1C27 on 18/11/2022 at 10h00.

Abstract

Bagging, boosting and Random Forests are classical ensemble methods used to improve the performance of single classifiers. They obtain superior performance by increasing the accuracy and diversity of the single classifiers. Attempts have been made to reproduce these methods in the more challenging context of evolving data streams. In this paper, we propose a new variant of bagging, called leveraging bagging. This method combines the simplicity of bagging with adding more randomization to the input and output of the classifiers. We test our method by performing an evaluation study on synthetic and real-world datasets comprising up to ten million examples.
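The abstract does not spell out how the extra input randomization works. As a rough illustration only, the Python sketch below assumes the usual online-bagging setup, where each ensemble member sees every incoming example with a weight drawn from a Poisson distribution, and raises the Poisson parameter above 1 so that members are trained on more heavily resampled streams. The value `lam=6.0`, the class names `CountingLearner` and `PoissonBagging`, and the toy base learner are assumptions made for this sketch. Output randomization and change detection are left out, and this is not the authors' implementation.

```python
import math
import random
from collections import Counter, defaultdict


def poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class CountingLearner:
    """Toy incremental learner (hypothetical stand-in for a stream classifier).

    It keeps weighted label counts per exact input and falls back to the
    overall majority label for inputs it has never seen.
    """

    def __init__(self):
        self.by_input = defaultdict(Counter)
        self.overall = Counter()

    def learn(self, x, y, weight):
        self.by_input[x][y] += weight
        self.overall[y] += weight

    def predict(self, x):
        counts = self.by_input.get(x) or self.overall
        return counts.most_common(1)[0][0] if counts else None


class PoissonBagging:
    """Online bagging where each member weights each example by Poisson(lam)."""

    def __init__(self, n_members=10, lam=6.0, seed=0):
        self.rng = random.Random(seed)
        self.lam = lam  # assumed value; lam=1.0 gives plain online bagging
        self.members = [CountingLearner() for _ in range(n_members)]

    def learn(self, x, y):
        for member in self.members:
            k = poisson(self.lam, self.rng)
            if k > 0:  # a weight of zero means this member skips the example
                member.learn(x, y, k)

    def predict(self, x):
        # Majority vote over members that have seen at least one example.
        votes = Counter(m.predict(x) for m in self.members)
        votes.pop(None, None)
        return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    ensemble = PoissonBagging(n_members=5, lam=6.0, seed=42)
    # Toy stream: the label is the XOR of two binary features.
    for _ in range(50):
        for a in (0, 1):
            for b in (0, 1):
                ensemble.learn((a, b), a ^ b)
    print(ensemble.predict((0, 1)))
```

With `lam=1.0` the same class reduces to standard online bagging, so the two settings can be compared on the same stream by changing a single argument.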