Random forests are among the most successful ensemble algorithms in machine learning. The basic idea is to construct a large number of random trees individually and make predictions by averaging their individual predictions. These successes have attracted much attention to the consistency of random forests, mostly in the regression setting. This work takes a step towards establishing convergence rates of random forests for classification. We present the first finite-sample convergence rate O(n^{-1/(8d+2)}) for pure random forests for classification, which can be improved to O(n^{-1/(3.87d+2)}) under the midpoint splitting mechanism. We further introduce a variant of random forests that follows Breiman's original random forests but differs in the mechanisms for choosing splitting dimensions and positions. For this variant we obtain a convergence rate O(n^{-1/(d+2)}(\ln n)^{1/(d+2)}), which matches, up to a factor of (\ln n)^{1/(d+2)}, the minimax rate of the optimal plug-in classifier under the L-Lipschitz assumption. Finally, we achieve a tighter convergence rate of O(\sqrt{\ln n/n}) under suitable structural assumptions on the data.
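The mechanism summarized above (independently grown random trees whose estimates are averaged and then thresholded) can be pictured with a small sketch. The Python code below is a minimal, hypothetical illustration of a pure random forest for binary classification, in which each split dimension and split position are drawn uniformly at random, independently of the labels; the class names (PureRandomTree, PureRandomForestClassifier) and parameters such as depth and n_trees are illustrative choices, not the paper's exact construction.

```python
# Minimal illustrative sketch of a pure random forest classifier;
# not the paper's exact construction.
import numpy as np


class PureRandomTree:
    """One purely random tree: splits are drawn independently of the labels."""

    def __init__(self, depth, d, rng):
        self.depth, self.d, self.rng = depth, d, rng

    def fit(self, X, y):
        # Recursively partition the unit cube [0, 1]^d with random splits.
        self.root = self._build(np.zeros(self.d), np.ones(self.d), X, y, self.depth)
        return self

    def _build(self, lo, hi, X, y, depth):
        if depth == 0 or len(y) == 0:
            # Leaf: store the empirical class-1 probability in this cell.
            return {"leaf": True, "value": y.mean() if len(y) else 0.5}
        dim = self.rng.integers(self.d)           # random split dimension
        pos = self.rng.uniform(lo[dim], hi[dim])  # random split position
        # (a midpoint-splitting variant would instead use pos = (lo[dim] + hi[dim]) / 2)
        left = X[:, dim] <= pos
        hi_l, lo_r = hi.copy(), lo.copy()
        hi_l[dim], lo_r[dim] = pos, pos
        return {
            "leaf": False, "dim": dim, "pos": pos,
            "left": self._build(lo, hi_l, X[left], y[left], depth - 1),
            "right": self._build(lo_r, hi, X[~left], y[~left], depth - 1),
        }

    def predict_proba(self, X):
        return np.array([self._traverse(self.root, x) for x in X])

    def _traverse(self, node, x):
        while not node["leaf"]:
            node = node["left"] if x[node["dim"]] <= node["pos"] else node["right"]
        return node["value"]


class PureRandomForestClassifier:
    """Average the trees' estimates and threshold at 1/2 (plug-in classifier)."""

    def __init__(self, n_trees=100, depth=5, seed=0):
        self.n_trees, self.depth, self.seed = n_trees, depth, seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        d = X.shape[1]
        self.trees = [PureRandomTree(self.depth, d, rng).fit(X, y)
                      for _ in range(self.n_trees)]
        return self

    def predict(self, X):
        avg = np.mean([t.predict_proba(X) for t in self.trees], axis=0)
        return (avg >= 0.5).astype(int)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(size=(500, 2))            # features in [0, 1]^2
    y = (X[:, 0] + X[:, 1] > 1).astype(int)   # a simple Lipschitz-style boundary
    clf = PureRandomForestClassifier(n_trees=50, depth=6).fit(X, y)
    print("training accuracy:", (clf.predict(X) == y).mean())
```

The key point the sketch tries to convey is that, for pure random forests, the partition of the input space is built independently of the labels; only the leaf estimates depend on the training data, which is what makes the finite-sample analysis of such forests tractable.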