Linear Regression vs K-Nearest Neighbors

One of the simplest and best-known non-parametric methods is K-nearest neighbors regression (KNN regression).

Given a value for K and a prediction point x0, KNN regression first identifies the K training observations that are closest to x0, represented by N0. It then estimates f(x0) using the average of all the training responses in N0. In other words:

$$\hat{f}(x_0) = \frac{1}{K} \sum_{x_i \in N_0} y_i$$
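As a quick illustration, here is a minimal NumPy sketch of this formula; the function name, the simulated data, and the choice of Euclidean distance are all assumptions for the example.

```python
import numpy as np

def knn_regress(X_train, y_train, x0, K):
    """Estimate f(x0) as the mean response of the K nearest training points."""
    # Euclidean distance from x0 to every training observation (assumed metric)
    dists = np.linalg.norm(X_train - x0, axis=1)
    # Indices of the K closest observations: the neighborhood N0
    N0 = np.argsort(dists)[:K]
    # Average the training responses in N0
    return y_train[N0].mean()

# Toy usage on simulated one-predictor data
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=50)
print(knn_regress(X, y, x0=np.array([0.5]), K=5))
```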

In general, the optimal value for K will depend on the bias-variance tradeoff: a small K gives the most flexible fit, with low bias but high variance, while a large K gives a smoother, less variable fit at the cost of some bias. Methods for estimating test error rates, which can be used to choose K, are discussed in Chapter 5.
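A rough sketch of how one might compare test MSE across values of K, using simulated data and scikit-learn's KNeighborsRegressor; the data-generating function and the grid of K values are assumptions for the example, and a simple held-out test set stands in for the resampling methods of Chapter 5.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Simulated data: a small K gives a flexible, high-variance fit;
# a large K gives a smoother, higher-bias fit
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=1)

for K in (1, 5, 10, 50, 100):
    knn = KNeighborsRegressor(n_neighbors=K).fit(X_tr, y_tr)
    print(f"K={K:3d}  test MSE={mean_squared_error(y_te, knn.predict(X_te)):.3f}")
```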

The parametric approach will outperform the non-parametric approach if the parametric form that has been selected is close to the true form of f.

Note that as the extent of non-linearity increases, there is little change in the test set MSE for the non-parametric KNN method, but there is a large increase in the test set MSE of linear regression.
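This can be checked with a small simulation that compares the two methods as the true f becomes more non-linear; the particular true functions and the choice K = 9 below are arbitrary assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(2)
X = rng.uniform(-2, 2, size=(400, 1))

# Increasing degrees of non-linearity in the true f (hypothetical choices)
truths = {
    "linear":        lambda x: 2 * x,
    "mildly curved": lambda x: 2 * x + x**2,
    "highly curved": lambda x: 3 * np.sin(3 * x),
}

for name, f in truths.items():
    y = f(X[:, 0]) + rng.normal(scale=0.5, size=400)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=2)
    lm_mse = mean_squared_error(y_te, LinearRegression().fit(X_tr, y_tr).predict(X_te))
    knn_mse = mean_squared_error(
        y_te, KNeighborsRegressor(n_neighbors=9).fit(X_tr, y_tr).predict(X_te))
    print(f"{name:14s}  linear MSE={lm_mse:.2f}  KNN MSE={knn_mse:.2f}")
```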

Even when the true relationship is highly non-linear, KNN may still provide inferior results to linear regression if there are many predictors. This decrease in performance as the dimension increases is a common problem for KNN, and results from the fact that in higher dimensions there is effectively a reduction in sample size: the K observations nearest to x0 may be quite far away from it, so the fit is no longer truly local.
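One way to see this is to keep a single informative predictor and pad the data with pure-noise predictors; as the dimension grows, KNN's neighborhoods stop being local and its test MSE rises. The data-generating setup and the numbers of noise predictors below are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(3)
n = 200
x = rng.uniform(-2, 2, size=n)
y = np.sin(2 * x) + rng.normal(scale=0.3, size=n)   # non-linear in the one real predictor

# Add increasing numbers of pure-noise predictors and watch KNN degrade
for p_noise in (0, 2, 5, 20):
    noise = rng.normal(size=(n, p_noise))
    X = np.column_stack([x] + ([noise] if p_noise else []))
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=3)
    knn_mse = mean_squared_error(
        y_te, KNeighborsRegressor(n_neighbors=5).fit(X_tr, y_tr).predict(X_te))
    lm_mse = mean_squared_error(
        y_te, LinearRegression().fit(X_tr, y_tr).predict(X_te))
    print(f"noise predictors={p_noise:2d}  KNN MSE={knn_mse:.2f}  linear MSE={lm_mse:.2f}")
```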

As a general rule, parametric methods will tend to outperform non-parametric approaches when there is a small number of observations per predictor.

Even when the dimension is small, we might prefer linear regression to KNN from an interpretability standpoint. If the test MSE of KNN is only slightly lower than that of linear regression, we might be willing to forgo a little bit of prediction accuracy for the sake of a simple model that can be described in terms of just a few coefficients, and for which p-values are available.
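For instance, an OLS fit can be summarized by a handful of coefficient estimates and p-values, something KNN has no analogue of. A minimal sketch using statsmodels, with purely illustrative simulated data:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical simulated data: two predictors with known coefficients
rng = np.random.default_rng(4)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=100)

ols = sm.OLS(y, sm.add_constant(X)).fit()
print(ols.params)    # intercept and slope estimates
print(ols.pvalues)   # p-values for each coefficient
```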
