Tao Wang - Colloquium Speaker

Assistant Professor, Department of Economics, University of Victoria
Thursday, February 22, 2024 - 3:30pm
Colloquium Title: Distributed Learning for Kernel Mode-Based Regression
Meet and Greet at 3:00 pm in 241 SH / Talk at 3:30 pm in 61 SH


In this paper, we propose a parametric kernel mode-based regression built on the mode value, which yields robust and efficient estimators for datasets that contain outliers or follow heavy-tailed distributions. We show that the resulting estimators attain the highest possible asymptotic breakdown point of 0.5. To address the challenges posed by massive datasets, we then integrate this regression method with distributed statistical learning techniques, which greatly reduces the amount of primary memory required and simultaneously accommodates heterogeneity in the estimation process. By approximating the local kernel objective function with a least-squares form, we can retain compact statistics for each worker machine, which makes it possible to reconstruct estimates for the entire dataset with minimal asymptotic approximation error. Using a Gaussian kernel, we introduce an iterative algorithm based on the expectation-maximization procedure that substantially reduces the computational burden. We establish the asymptotic properties of the developed mode-based estimators and demonstrate that the proposed estimator for massive datasets is statistically as efficient as the global mode-based estimator computed from the full dataset. Additionally, we explore shrinkage estimation through local quadratic approximation and show that the resulting estimator possesses the oracle property under an adaptive LASSO approach. The finite-sample performance of the developed method is illustrated using simulations as well as real data analysis.
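
For readers unfamiliar with mode-based regression, the sketch below illustrates the general idea behind an expectation-maximization style iteration with a Gaussian kernel: residuals are turned into kernel weights, and the coefficients are then updated by weighted least squares. It is a minimal single-machine illustration of that general technique, not the speaker's algorithm. The function name modal_regression_em, the fixed bandwidth h, and the ordinary least squares starting value are all assumptions made for this example, and the distributed and shrinkage steps described in the abstract are not covered.

```python
import numpy as np


def modal_regression_em(X, y, h, beta0=None, max_iter=200, tol=1e-8):
    """Linear mode-based regression with a Gaussian kernel (illustrative sketch).

    Seeks a local maximizer of sum_i exp(-0.5 * ((y_i - x_i' beta) / h)^2)
    by alternating kernel weighting and weighted least squares.
    The bandwidth h is taken as given (an assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Assumed starting value: ordinary least squares, unless one is supplied.
    if beta0 is None:
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
    else:
        beta = np.asarray(beta0, dtype=float)

    for _ in range(max_iter):
        resid = y - X @ beta

        # E-step: Gaussian kernel weights, computed on the log scale and
        # shifted by the maximum so they cannot all underflow to zero.
        log_w = -0.5 * (resid / h) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()

        # M-step: weighted least squares update of the coefficients.
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)

        # Stop once the coefficients no longer change appreciably.
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break

    return beta
```

Because observations with large residuals receive weights close to zero, gross outliers have little influence on the update, which is the intuition behind the robustness claim. The kernel objective can have several local maxima, so the result depends on the starting value and on the choice of h.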