Yuxin Chen, Jianqing Fan, Cong Ma, Yuling Yan.
Abstract
Noisy matrix completion aims at estimating a low-rank matrix given only partial and corrupted entries. Despite remarkable progress in designing efficient estimation algorithms, it remains largely unclear how to assess the uncertainty of the obtained estimates and how to perform efficient statistical inference on the unknown matrix (e.g., constructing a valid and short confidence interval for an unseen entry). This paper takes a substantial step toward addressing such tasks. We develop a simple procedure to compensate for the bias of the widely used convex and nonconvex estimators. The resulting debiased estimators admit nearly precise nonasymptotic distributional characterizations, which in turn enable optimal construction of confidence intervals/regions for, say, the missing entries and the low-rank factors. Our inferential procedures do not require sample splitting, thus avoiding unnecessary loss of data efficiency. As a byproduct, we obtain a sharp characterization of the estimation accuracy of our debiased estimators in both rate and constant. Our debiased estimators are tractable algorithms that provably achieve full statistical efficiency.
Keywords: confidence intervals; convex relaxation; nonconvex optimization
Year: 2019 PMID: 31666329 PMCID: PMC6859358 DOI: 10.1073/pnas.1910053116
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Notation used to unify the convex estimate and the nonconvex estimate
| Either | |
| For the nonconvex case, we take | |
| The proposed debiased estimator as in | |
| The proposed estimator as in |
Here, is the best rank- approximation of . See for a complete summary.
Fig. 2. (Left) Estimation error of vs. measured in the Frobenius norm. (Right) Estimation error of vs. measured in the norm. The results are averaged over 20 independent trials for , , and .
Empirical coverage rates of for different over 200 Monte Carlo trials
| 0.9380 | 0.0200 | |
| 0.9392 | 0.0196 | |
| 0.9455 | 0.0164 | |
| 0.9456 | 0.0164 | |
| 0.9226 | 0.0247 | |
| 0.9271 | 0.0228 | |
| 0.9410 | 0.0173 | |
| 0.9417 | 0.0172 |
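As a rough illustration of how an empirical coverage rate over Monte Carlo trials can be computed, the sketch below uses a deliberately simple stand-in problem: a 95% normal confidence interval for a Gaussian mean with known variance. It does not reproduce the matrix completion experiment summarized in the table. The function name `empirical_coverage` and all parameter values are hypothetical choices made for this example.

```python
import random
from statistics import NormalDist, mean


def empirical_coverage(n_trials=200, n_samples=50, true_mean=1.0,
                       sigma=1.0, level=0.95, seed=0):
    """Fraction of trials in which the confidence interval contains the truth.

    Toy setting (an assumption for illustration): i.i.d. Gaussian samples
    with known standard deviation `sigma`.
    """
    rng = random.Random(seed)
    # Two-sided normal quantile, e.g. about 1.96 for level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n_samples ** 0.5
    hits = 0
    for _ in range(n_trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n_samples)]
        estimate = mean(sample)
        # The interval [estimate - half_width, estimate + half_width]
        # covers the truth exactly when the error is within half_width.
        hits += abs(estimate - true_mean) <= half_width
    return hits / n_trials


coverage = empirical_coverage()
print(f"empirical coverage over 200 trials: {coverage:.4f}")
```

With a valid interval, the printed rate should sit near the nominal level, up to Monte Carlo fluctuation of order sqrt(0.95 × 0.05 / 200) ≈ 0.015.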
Fig. 1. Q-Q plots of (Left) and (Right) vs. the standard normal distribution. The results are reported over 200 independent trials for , , and .
Empirical coverage rates and average lengths of the confidence intervals of the entries as well as the estimation error vs. observation probability
| Observation probability | Coverage rate (Mean) | Coverage rate (SD) | CI length (Mean) | CI length (SD) | Estimation error (Convex) | Estimation error (Debiased) |
| 0.5 | 0.8265 | 0.0016 | 3.6698 | 0.0209 | 0.029 | 0.028 |
| 0.6 | 0.8268 | 0.0011 | 2.8774 | 0.0098 | 0.025 | 0.023 |
| 0.7 | 0.8431 | 0.0006 | 2.3426 | 0.0054 | 0.022 | 0.019 |
| 0.8 | 0.8725 | 0.0003 | 2.0234 | 0.0052 | 0.020 | 0.015 |
| 0.9 | 0.9093 | 0.0003 | 1.8296 | 0.0072 | 0.018 | 0.011 |
The results are averaged over 20 Monte Carlo trials.
Gradient descent for solving Eq.
| where |
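The body of this algorithm did not survive extraction, so the referenced equation and update rule are not reproduced here. Purely as an illustration, the sketch below runs gradient descent on a commonly used regularized least-squares objective over low-rank factors, f(X, Y) = (1/(2p)) ‖P_Ω(XYᵀ − M)‖²_F + (λ/(2p)) (‖X‖²_F + ‖Y‖²_F). This objective, the spectral initialization, the step size rule, the choice of λ, and every name in the code are assumptions for the example and may differ from what the paper specifies.

```python
import numpy as np


def factored_loss(X, Y, M_obs, mask, lam, p):
    """Assumed objective: regularized squared error on observed entries."""
    resid = mask * (X @ Y.T - M_obs)
    return (0.5 * np.sum(resid ** 2)
            + 0.5 * lam * (np.sum(X ** 2) + np.sum(Y ** 2))) / p


def spectral_init(M_obs, r, p):
    """Balanced rank-r factors from the SVD of the rescaled observations."""
    U, s, Vt = np.linalg.svd(M_obs / p, full_matrices=False)
    return U[:, :r] * np.sqrt(s[:r]), Vt[:r].T * np.sqrt(s[:r]), s[0]


def gradient_descent(M_obs, mask, r, lam, p, step_scale=0.1, n_iter=1000):
    X, Y, top_sv = spectral_init(M_obs, r, p)
    eta = step_scale / top_sv  # conservative step size (an assumption)
    for _ in range(n_iter):
        resid = mask * (X @ Y.T - M_obs)
        grad_X = (resid @ Y + lam * X) / p
        grad_Y = (resid.T @ X + lam * Y) / p
        # Simultaneous update: both gradients use the previous iterate.
        X, Y = X - eta * grad_X, Y - eta * grad_Y
    return X, Y


# Synthetic demo with hypothetical sizes and noise level.
rng = np.random.default_rng(0)
n, r, p, sigma = 80, 2, 0.5, 0.01
M_star = rng.standard_normal((n, r)) @ rng.standard_normal((n, r)).T
mask = rng.random((n, n)) < p                      # observed positions
M_obs = mask * (M_star + sigma * rng.standard_normal((n, n)))
lam = sigma * np.sqrt(n * p)                       # illustrative choice

X0, Y0, _ = spectral_init(M_obs, r, p)
loss_init = factored_loss(X0, Y0, M_obs, mask, lam, p)
X_hat, Y_hat = gradient_descent(M_obs, mask, r, lam, p)
loss_final = factored_loss(X_hat, Y_hat, M_obs, mask, lam, p)
rel_err = np.linalg.norm(X_hat @ Y_hat.T - M_star) / np.linalg.norm(M_star)
print(f"loss {loss_init:.3f} -> {loss_final:.5f}, relative error {rel_err:.4f}")
```

The step size is scaled by the top singular value of the rescaled observations so that the iteration stays stable without manual tuning. This is a convenience for the demo, not a recommendation drawn from the source.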