The Convergence Rate of Majority Vote Under Exchangeability
Miles Lopes, University of California, Berkeley
Majority vote plays a fundamental role in many applications of statistics, such as ensemble classifiers, crowdsourcing, and elections. When using majority vote as a prediction rule, it is of basic interest to ask “How many votes are needed to obtain a reliable prediction?” For instance, in the context of ensemble classifiers, this question determines the algorithmic convergence rate and computational cost of an ensemble. In this paper, we provide a precise answer for the specific cases of Bagging and Random Forests: If err_t denotes the test error achieved by the majority vote of t>1 classifiers, and err* denotes its nominal limiting value, then under basic regularity conditions, err_t = err* + c/t +o(1/t), where c is a constant given by a simple formula. More generally, we show that if V_1,V_2,... is an exchangeable Bernoulli sequence with mixture distribution F, and the majority vote is written as M_t = median(V_1,...,V_t), then 1-E[M_t] = F(1/2)+(1/8)F''(1/2)/t + o(1/t) when F is sufficiently smooth.
Abstract Author(s): Miles Lopes