csc 591-024, (8290)
csc 791-024, (8291)
fall 2024, special topics in computer science
Tim Menzies, timm@ieee.org, com sci, nc state
August 2024
A Bayes classifier is a simple statistics-based learning scheme.

Advantages: tiny memory footprint; fast, incremental training (its statistics update one row at a time).

Assumptions: attributes are equally important and statistically independent, given the class. Although these assumptions are almost never correct, this scheme works well in practice¹.
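In symbols (a standard statement of the rule, included here for reference): for a class $C$ and evidence $E_1,\dots,E_n$, the independence assumption lets the likelihood factor as

$$
like(C \mid E_1,\dots,E_n) \;\propto\; P(C)\prod_{i=1}^{n} P(E_i \mid C)
$$

i.e. a class prior times one term per attribute, which is exactly what the tables and code below compute.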
like(B)
  ^
6 |           /
  |         /
  |       /
3 |     /        <-- region where L(A) > L(B)
  |   /              lies below this diagonal
  | /
  |/
  .-----------> like(A)
  0     3     6
Aside: note the zone of confusion near the diagonal, where like(A) ≈ like(B) and the classifier has little basis for preferring either class.
outlook   temperature  humidity  windy  play
-------   -----------  --------  -----  ----
rainy     cool         normal    TRUE   no
rainy     mild         high      TRUE   no
sunny     hot          high      FALSE  no
sunny     hot          high      TRUE   no
sunny     mild         high      FALSE  no
overcast  cool         normal    TRUE   yes
overcast  hot          high      FALSE  yes
overcast  hot          normal    FALSE  yes
overcast  mild         high      TRUE   yes
rainy     cool         normal    FALSE  yes
rainy     mild         high      FALSE  yes
rainy     mild         normal    FALSE  yes
sunny     cool         normal    FALSE  yes
sunny     mild         normal    TRUE   yes
This data can be summarized as follows:
         Outlook               Temperature            Humidity
  ====================    =================    ===================
           Yes   No                Yes  No                Yes   No
Sunny       2    3        Hot       2   2      High        3    4
Overcast    4    0        Mild      4   2      Normal      6    1
Rainy       3    2        Cool      3   1
--------------------      -----------------    -------------------
Sunny      2/9  3/5       Hot      2/9 2/5     High       3/9  4/5
Overcast   4/9  0/5       Mild     4/9 2/5     Normal     6/9  1/5
Rainy      3/9  2/5       Cool     3/9 1/5

       Windy                 Play
  =================      ==========
         Yes   No         Yes   No
False     6    2           9    5
True      3    3
-----------------        ----------
False    6/9  2/5        9/14  5/14
True     3/9  3/5
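A sketch of where those counts come from: tally (value, class) pairs, then divide by the class totals. The list-of-lists layout here is an assumption for illustration, not the course's actual data structures.

from collections import Counter

rows = [line.split() for line in """\
rainy    cool normal TRUE  no
rainy    mild high   TRUE  no
sunny    hot  high   FALSE no
sunny    hot  high   TRUE  no
sunny    mild high   FALSE no
overcast cool normal TRUE  yes
overcast hot  high   FALSE yes
overcast hot  normal FALSE yes
overcast mild high   TRUE  yes
rainy    cool normal FALSE yes
rainy    mild high   FALSE yes
rainy    mild normal FALSE yes
sunny    cool normal FALSE yes
sunny    mild normal TRUE  yes""".splitlines()]

play    = Counter(r[-1] for r in rows)           # Counter({'yes': 9, 'no': 5})
outlook = Counter((r[0], r[-1]) for r in rows)   # e.g. ('sunny','yes') -> 2
print(outlook["sunny","yes"], "/", play["yes"])  # -> 2 / 9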
So, what happens on a new day with outlook=sunny, temperature=cool, humidity=high, windy=TRUE?
First find the likelihood of the two classes (read each fraction off the tables above):

  like(yes) = 2/9 * 3/9 * 3/9 * 3/9 * 9/14 = 0.0053
  like(no)  = 3/5 * 1/5 * 4/5 * 3/5 * 5/14 = 0.0206
Conversion into a probability by normalization:

  P(yes) = 0.0053 / (0.0053 + 0.0206) = 0.205
  P(no)  = 0.0206 / (0.0053 + 0.0206) = 0.795
So, we aren’t playing golf today.
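As a quick sanity check, the same arithmetic in Python:

like_yes = (2/9) * (3/9) * (3/9) * (3/9) * (9/14)   # 0.0053
like_no  = (3/5) * (1/5) * (4/5) * (3/5) * (5/14)   # 0.0206
print(like_no / (like_yes + like_no))               # 0.795, so "no" wins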
For numeric columns we cannot count discrete values; instead each column keeps running statistics (`mu`, `sd`, `lo`, `hi`), updated incrementally as each row arrives:
@of("If `x` is known, add this COL.")
def add(self:COL, x:any) -> any:
if x != "?":
self.n += 1
self.add1(x)
@of("add symbol counts.")
def add1(self:SYM, x:any) -> any:
self.has[x] = self.has.get(x,0) + 1
if self.has[x] > self.most: self.mode, self.most = x, self.has[x]
return x
@of("add `mu` and `sd` (and `lo` and `hi`). If `x` is a string, coerce to a number.")
def add1(self:NUM, x:any) -> number:
self.lo = min(x, self.lo)
self.hi = max(x, self.hi)
= x - self.mu
d self.mu += d / self.n
self.m2 += d * (x - self.mu)
self.sd = 0 if self.n <2 else (self.m2/(self.n-1))**.5
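A quick stand-alone check of that update rule (Welford's algorithm). The `Num` class here is a hypothetical stand-in for the NUM column above, not the course's actual class:

import statistics

class Num:
  def __init__(self):
    self.n = 0; self.mu = 0; self.m2 = 0; self.sd = 0
    self.lo, self.hi = float("inf"), float("-inf")
  def add(self, x):
    self.n  += 1
    self.lo  = min(x, self.lo)
    self.hi  = max(x, self.hi)
    d        = x - self.mu
    self.mu += d / self.n
    self.m2 += d * (x - self.mu)
    self.sd  = 0 if self.n < 2 else (self.m2/(self.n - 1))**0.5

num = Num()
xs  = [83, 70, 68, 64, 69, 65, 75, 75, 72, 81]
for x in xs: num.add(x)
assert abs(num.mu - statistics.mean(xs))  < 1e-9   # matches two-pass mean
assert abs(num.sd - statistics.stdev(xs)) < 1e-9   # matches two-pass sample sd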
So here’s the NB classifier:
@of("How much DATA likes a `row`.")
def loglike(self:DATA, r:row, nall:int, nh:int) -> float:
= (len(self.rows) + the.k) / (nall + the.k*nh)
prior = [c.like(r[c.at], prior) for c in self.cols.x if r[c.at] != "?"]
likes return sum(log(x) for x in likes + [prior] if x>0)
@of("How much a SYM likes a value `x`.")
def like(self:SYM, x:any, prior:float) -> float:
return (self.has.get(x,0) + the.m*prior) / (self.n + the.m)
@of("How much a NUM likes a value `x`.")
def like(self:NUM, x:number, _) -> float:
= self.sd**2 + 1E-30
v = exp(-1*(x - self.mu)**2/(2*v)) + 1E-30
nom = (2*pi*v) **0.5
denom return min(1, nom/(denom + 1E-30))
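To see those formulas in action, here is a self-contained sketch of the same math for symbolic attributes, reusing the `rows` list from the counting sketch above. `k=1` and `m=2` are assumed defaults for `the.k` and `the.m`, not confirmed by these notes:

from math import log

def loglike(row, h, rows, k=1, m=2):
  hrows = [r for r in rows if r[-1] == h]              # rows of class h
  nh    = len(set(r[-1] for r in rows))                # number of classes
  prior = (len(hrows) + k) / (len(rows) + k*nh)        # k-smoothed class prior
  out   = log(prior)
  for at, x in enumerate(row):                         # for each (independent) attribute:
    freq = sum(1 for r in hrows if r[at] == x)         #   count x within class h,
    out += log((freq + m*prior) / (len(hrows) + m))    #   add its m-estimate log-likelihood
  return out

day = ["sunny", "cool", "high", "TRUE"]
print(max(["yes", "no"], key=lambda h: loglike(day, h, rows)))   # -> no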
Pedro Domingos and Michael Pazzani. 1997. On the Optimality of the Simple Bayesian Classifier under Zero-One Loss. Machine Learning 29(2-3), 103-130.