NIPS Poster Total Variation Classes Beyond 1d: Minimax Rates, and the Limitations of Linear Smoothers

Poster

Total Variation Classes Beyond 1d: Minimax Rates, and the Limitations of Linear Smoothers

Veeranjaneyulu Sadhanala · Yu-Xiang Wang · Ryan Tibshirani

Area 5+6+7+8 #38

Keywords: [ (Other) Statistics ] [ (Other) Regression ] [ Sparsity and Feature Selection ]

[ Abstract ]

Abstract:

We consider the problem of estimating a function defined over n locations on a d-dimensional grid (having all side lengths equal to n^(1/d)). When the function is constrained to have discrete total variation bounded by Cn, we derive the minimax optimal (squared) L_2 estimation error rate, parametrized by n, Cn. Total variation denoising, also known as the fused lasso, is seen to be rate optimal. Several simpler estimators exist, such as Laplacian smoothing and Laplacian eigenmaps. A natural question is: can these simpler estimators perform just as well? We prove that these estimators, and more broadly all estimators given by linear transformations of the input data, are suboptimal over the class of functions with bounded variation. This extends fundamental findings of Donoho and Johnstone (1998) on 1-dimensional total variation spaces to higher dimensions. The implication is that the computationally simpler methods cannot be used for such sophisticated denoising tasks, without sacrificing statistical accuracy. We also derive minimax rates for discrete Sobolev spaces over d-dimensional grids, which are, in some sense, smaller than the total variation function spaces. Indeed, these are small enough spaces that linear estimators can be optimal---and a few well-known ones are, such as Laplacian smoothing and Laplacian eigenmaps, as we show. Lastly, we investigate the adaptivity of the total variation denoiser to these smaller Sobolev function spaces.

Live content is unavailable. Log in and register to view live content