A Comparison of Classification Methods

An Analytical Comparison

Mathematically: LDA vs QDA vs naive Bayes vs logistic regression

Setting class K as the baseline, we assign an observation to the class k, for k = 1, …, K, that maximizes

$$\log\left(\frac{\Pr(Y=k \mid X=x)}{\Pr(Y=K \mid X=x)}\right)$$

For LDA:

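$$\log\left(\frac{\Pr(Y=k \mid X=x)}{\Pr(Y=K \mid X=x)}\right) = a_k + \sum_{j=1}^{p} b_{kj} x_j$$

where $a_k = \log\left(\frac{\pi_k}{\pi_K}\right) - \frac{1}{2}(\mu_k + \mu_K)^T \Sigma^{-1} (\mu_k - \mu_K)$ and $b_{kj}$ is the $j$th component of $\Sigma^{-1}(\mu_k - \mu_K)$.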
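As a quick numerical check of the LDA formula, the sketch below compares the linear form a_k + Σ b_kj x_j with the log odds computed directly from Gaussian densities that share one covariance matrix. All parameter values are made up for illustration, and the helper names (`inv2`, `matvec`, `lda_coefficients`, and so on) are hypothetical; only two predictors are used so that the matrix inverse can be written by hand.

```python
import math

def inv2(S):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = S
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gaussian_log_density(x, mu, S):
    """Log density of a bivariate normal N(mu, S) evaluated at x."""
    (a, b), (c, d) = S
    det = a * d - b * c
    diff = [xi - mi for xi, mi in zip(x, mu)]
    quad = dot(diff, matvec(inv2(S), diff))
    return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

def lda_coefficients(pi_k, pi_K, mu_k, mu_K, S):
    """Return (a_k, b_k) for the LDA log odds of class k against baseline K."""
    diff = [m - n for m, n in zip(mu_k, mu_K)]
    total = [m + n for m, n in zip(mu_k, mu_K)]
    b_k = matvec(inv2(S), diff)                      # Sigma^{-1} (mu_k - mu_K)
    a_k = math.log(pi_k / pi_K) - 0.5 * dot(total, b_k)
    return a_k, b_k

# Illustrative (made-up) priors, class means, shared covariance, and test point.
pi_k, pi_K = 0.3, 0.7
mu_k, mu_K = [1.0, 2.0], [-0.5, 0.5]
S = [[2.0, 0.3], [0.3, 1.0]]
x = [0.4, -1.2]

a_k, b_k = lda_coefficients(pi_k, pi_K, mu_k, mu_K, S)
linear_form = a_k + dot(b_k, x)

# Log odds computed directly from Bayes' theorem with Gaussian class densities.
direct = (math.log(pi_k) + gaussian_log_density(x, mu_k, S)) - (
    math.log(pi_K) + gaussian_log_density(x, mu_K, S)
)

print(linear_form, direct)
```

The two printed values should agree up to floating-point error, because the quadratic terms in x cancel when both classes share the same Σ.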

For QDA:

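$$\log\left(\frac{\Pr(Y=k \mid X=x)}{\Pr(Y=K \mid X=x)}\right) = a_k + \sum_{j=1}^{p} b_{kj} x_j + \sum_{j=1}^{p} \sum_{l=1}^{p} c_{kjl} x_j x_l$$

where $a_k$, $b_{kj}$, and $c_{kjl}$ are functions of $\pi_k$, $\pi_K$, $\mu_k$, $\mu_K$, $\Sigma_k$ and $\Sigma_K$.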
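For reference, one way to write the QDA coefficients explicitly is sketched here. It assumes Gaussian class densities with class-specific covariance matrices and simply expands the two quadratic forms in the log ratio; the vector and matrix symbols $b_k$ and $C_k$ are introduced only for this sketch.

```latex
\begin{aligned}
\log\left(\frac{\Pr(Y=k \mid X=x)}{\Pr(Y=K \mid X=x)}\right)
  &= \log\left(\frac{\pi_k}{\pi_K}\right)
     + \frac{1}{2}\log\left(\frac{|\Sigma_K|}{|\Sigma_k|}\right)
     - \frac{1}{2}(x-\mu_k)^T \Sigma_k^{-1} (x-\mu_k)
     + \frac{1}{2}(x-\mu_K)^T \Sigma_K^{-1} (x-\mu_K) \\
  &= a_k + b_k^T x + x^T C_k x, \\
a_k &= \log\left(\frac{\pi_k}{\pi_K}\right)
     + \frac{1}{2}\log\left(\frac{|\Sigma_K|}{|\Sigma_k|}\right)
     - \frac{1}{2}\mu_k^T \Sigma_k^{-1} \mu_k
     + \frac{1}{2}\mu_K^T \Sigma_K^{-1} \mu_K, \\
b_k &= \Sigma_k^{-1}\mu_k - \Sigma_K^{-1}\mu_K, \\
C_k &= \frac{1}{2}\left(\Sigma_K^{-1} - \Sigma_k^{-1}\right),
\end{aligned}
```

Here $b_{kj}$ is the $j$th component of $b_k$ and $c_{kjl}$ is the $(j,l)$ entry of $C_k$. When $\Sigma_k = \Sigma_K = \Sigma$, the quadratic term vanishes and the expression reduces to the LDA form.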

For naive Bayes:

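$$\log\left(\frac{\Pr(Y=k \mid X=x)}{\Pr(Y=K \mid X=x)}\right) = a_k + \sum_{j=1}^{p} g_{kj}(x_j)$$

where $a_k = \log\left(\frac{\pi_k}{\pi_K}\right)$ and $g_{kj}(x_j) = \log\left(\frac{f_{kj}(x_j)}{f_{Kj}(x_j)}\right)$.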
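To make the additive structure concrete, here is a small sketch that assumes each f_kj is a univariate Gaussian density with a standard deviation shared across the two classes. That choice, the parameter values, and the helper name `normal_log_pdf` are assumptions for illustration only. The sketch checks that the additive form matches the log odds computed from the product of per-feature densities, and that under this particular assumption each g_kj is linear in x_j.

```python
import math

def normal_log_pdf(x, mu, sigma):
    """Log density of a univariate normal N(mu, sigma^2) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

# Illustrative (made-up) parameters for class k and baseline class K, p = 3.
pi_k, pi_K = 0.4, 0.6
mu_k = [1.0, -2.0, 0.5]
mu_K = [0.0, 1.0, 0.5]
sigma = [1.0, 2.0, 0.5]   # assumption: same standard deviation in both classes
x = [0.3, -0.7, 1.1]

a_k = math.log(pi_k / pi_K)

# Additive form: a_k + sum_j g_kj(x_j), with g_kj = log(f_kj / f_Kj).
g = [
    normal_log_pdf(xj, mk, s) - normal_log_pdf(xj, mK, s)
    for xj, mk, mK, s in zip(x, mu_k, mu_K, sigma)
]
additive_log_odds = a_k + sum(g)

# Direct form: log of (pi_k * prod_j f_kj) / (pi_K * prod_j f_Kj).
joint_k = pi_k * math.prod(
    math.exp(normal_log_pdf(xj, mk, s)) for xj, mk, s in zip(x, mu_k, sigma)
)
joint_K = pi_K * math.prod(
    math.exp(normal_log_pdf(xj, mK, s)) for xj, mK, s in zip(x, mu_K, sigma)
)
direct_log_odds = math.log(joint_k / joint_K)

# With shared variances, each g_kj(x_j) collapses to slope * x_j + constant.
linear_log_odds = a_k + sum(
    (mk - mK) / s ** 2 * xj - (mk ** 2 - mK ** 2) / (2 * s ** 2)
    for xj, mk, mK, s in zip(x, mu_k, mu_K, sigma)
)

print(additive_log_odds, direct_log_odds, linear_log_odds)
```

All three values should coincide up to rounding. With other choices of f_kj (for example class-specific variances or kernel density estimates), the g_kj terms are generally nonlinear, which is what distinguishes naive Bayes from the linear methods.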

For logistic regression:

$$\log\left(\frac{\Pr(Y=k \mid X=x)}{\Pr(Y=K \mid X=x)}\right) = \beta_{k0} + \sum_{j=1}^{p} \beta_{kj} x_j$$

where the β coefficients are chosen to maximize the likelihood function.
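The idea of choosing β by maximizing the likelihood can be sketched for the simplest case: two classes, one predictor, with class 0 as the baseline. The example below uses plain gradient ascent on the log-likelihood; this is an illustrative stand-in, not necessarily the optimizer any particular library uses, and the data, learning rate, and function names are made up.

```python
import math

def log_likelihood(beta0, beta1, xs, ys):
    """Bernoulli log-likelihood of a one-predictor logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = beta0 + beta1 * x                  # log odds of class 1 vs baseline 0
        total += y * z - math.log(1 + math.exp(z))
    return total

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Maximize the log-likelihood by gradient ascent (illustrative only)."""
    beta0 = beta1 = 0.0
    n = len(xs)
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            p = 1 / (1 + math.exp(-(beta0 + beta1 * x)))   # Pr(Y=1 | X=x)
            grad0 += y - p
            grad1 += (y - p) * x
        beta0 += lr * grad0 / n
        beta1 += lr * grad1 / n
    return beta0, beta1

# Made-up, deliberately overlapping data so the maximum is finite.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 1, 0, 0, 1, 1, 0, 1]

beta0, beta1 = fit_logistic(xs, ys)
start_ll = log_likelihood(0.0, 0.0, xs, ys)
final_ll = log_likelihood(beta0, beta1, xs, ys)
print(beta0, beta1, start_ll, final_ll)
```

The fitted log-likelihood should exceed its value at β = 0. Note the contrast with LDA, where the coefficients a_k and b_kj come from estimated means, priors, and a covariance matrix rather than from direct likelihood maximization over β.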

KNN, by contrast, is a non-parametric approach: it makes no assumptions about the shape of the decision boundary.
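For contrast with the parametric log-odds formulas, a bare-bones KNN classifier might look like the following. It assumes Euclidean distance and a simple majority vote; the function name `knn_predict`, the parameter `n_neighbors`, and the toy data are all hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, n_neighbors=3):
    """Classify x by majority vote among its n_neighbors nearest training points."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], x),   # Euclidean distance to x
    )
    votes = Counter(label for _, label in nearest[:n_neighbors])
    return votes.most_common(1)[0][0]

# Made-up training data: class "a" near the origin, class "b" near (5, 5).
train_X = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (5.0, 5.0), (5.5, 4.8), (4.7, 5.3)]
train_y = ["a", "a", "a", "b", "b", "b"]

pred_near_a = knn_predict(train_X, train_y, (0.2, 0.1))
pred_near_b = knn_predict(train_X, train_y, (5.1, 5.0))
print(pred_near_a, pred_near_b)
```

No coefficients are estimated here: the prediction depends only on which training points lie closest to x.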

An Empirical Comparison

Practically: LDA vs QDA vs naive Bayes vs logistic regression vs KNN

Which is better? It depends on your use case! No single method dominates in every situation, so know how to estimate and compare each model's test accuracy.
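One common way to run such a comparison is k-fold cross-validation. The sketch below assumes scikit-learn is installed and uses a synthetic dataset from `make_classification`, so the resulting accuracies say nothing about real data; the dataset settings, the choice of five folds, and `n_neighbors=5` are arbitrary illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data, purely for illustration.
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
    "Naive Bayes": GaussianNB(),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "KNN (5 neighbors)": KNeighborsClassifier(n_neighbors=5),
}

# Mean 5-fold cross-validated accuracy for each classifier.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

for name, acc in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:20s} {acc:.3f}")
```

The ranking will change with the data: linear methods tend to do well when the true boundary is close to linear, while QDA and KNN can pull ahead when it is not, provided there are enough observations.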

