CSCE 470 Lecture 21


« previous | Friday, October 11, 2013 | next »


Classifications

Rocchio

Finds "separating hyperplane" between classes of data (think of a Voronoi Diagram)
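
As a rough illustration of the idea, Rocchio classification is commonly implemented as a nearest-centroid rule: each class is summarized by the mean of its training vectors, and a new point goes to the class with the closest centroid. The boundary between two centroids is then a hyperplane, which is where the Voronoi picture comes from. The sketch below assumes plain Euclidean distance and toy two-dimensional vectors; the names `centroid`, `rocchio_classify`, and the sample data are made up for illustration.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def rocchio_classify(x, training):
    """Return the label whose class centroid is nearest to x (Euclidean distance)."""
    centroids = {label: centroid(vecs) for label, vecs in training.items()}
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))

# Hypothetical toy data: two classes in a 2-D feature space.
training = {
    "sports": [[1.0, 0.0], [0.8, 0.2]],
    "politics": [[0.0, 1.0], [0.2, 0.8]],
}
label = rocchio_classify([0.9, 0.3], training)
```

Here `label` is the class whose Voronoi cell (around its centroid) contains the query point.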

Naïve Bayes

Still given a set of documents and a corpus of documents


Bayes' Rule

$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$

Application

Think of $A$ as a class $c$ and $B$ as a document $d$.

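We want to estimate $P(c \mid d)$ for each class $c$ and a document $d$.

With respect to Bayes' Theorem, we do a little trick: the denominator $P(d)$ is the same for every class (the notes treat all documents as equally probable), so it doesn't matter when comparing classes.

Thus $P(c \mid d) \propto P(d \mid c)\,P(c)$.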

We want to find the best class, i.e. the one that gives us the highest probability:

$c_{MAP} = \arg\max_{c} P(c \mid d) = \arg\max_{c} P(d \mid c)\,P(c)$

where MAP stands for Maximum A Posteriori.
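
A small sketch of the MAP decision: pick the class with the largest product of likelihood and prior, and check that dividing by the document probability does not change the winner. The function name `map_class` and all numbers below are hypothetical values chosen for illustration, not figures from the lecture.

```python
def map_class(priors, likelihoods):
    """Return the class c that maximizes P(d|c) * P(c); the constant P(d) is omitted."""
    return max(priors, key=lambda c: likelihoods[c] * priors[c])

# Hypothetical values of P(c) and P(d|c) for one fixed document d.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {"spam": 0.05, "ham": 0.01}

best = map_class(priors, likelihoods)

# Normalizing by P(d) = sum over c of P(d|c) P(c) rescales every score equally,
# so the class with the highest posterior is the same one.
evidence = sum(likelihoods[c] * priors[c] for c in priors)
posteriors = {c: likelihoods[c] * priors[c] / evidence for c in priors}
```

The dictionary `posteriors` sums to 1, but its largest entry belongs to the same class as `best`.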

Training Data

Suppose we have $N_c$ documents in class $c$ for a corpus of size $N$.

We can estimate the probability of a class as $\hat{P}(c) = \frac{N_c}{N}$.
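
The class probability estimate is just a relative frequency over the training labels. A minimal sketch, assuming the training data is available as a list with one class label per document (the name `estimate_priors` and the labels are illustrative):

```python
from collections import Counter

def estimate_priors(labels):
    """Estimate P(c) as (documents in class c) / (documents in the corpus)."""
    counts = Counter(labels)   # documents per class
    n = len(labels)            # corpus size
    return {c: n_c / n for c, n_c in counts.items()}

# Hypothetical training labels, one per document.
labels = ["spam", "ham", "ham", "spam", "ham"]
priors = estimate_priors(labels)
```

With two of five documents labeled `"spam"`, the estimated prior for that class is 2/5.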

We can estimate $P(d \mid c)$ by analyzing whether each term $t$ in document $d$ is from a certain class:

$P(d \mid c) = \prod_{t \in d} P(t \mid c)$
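
One way to sketch this step is the usual Naïve Bayes independence assumption: the probability of a document given a class is the product of the per-term probabilities for that class. The term estimate used below (relative frequency of a term among all tokens of the class, with no smoothing) is an assumption for illustration, as are the function names and the toy documents.

```python
from collections import Counter

def term_probabilities(docs_in_class):
    """Relative frequency of each term among all tokens of one class (assumed estimate)."""
    counts = Counter(t for doc in docs_in_class for t in doc)
    total = sum(counts.values())
    return {t: k / total for t, k in counts.items()}

def likelihood(doc, term_probs):
    """P(d|c) as the product of P(t|c) over the terms of d.

    Terms never seen in the class get probability 0 here, because this
    sketch applies no smoothing.
    """
    p = 1.0
    for t in doc:
        p *= term_probs.get(t, 0.0)
    return p

# Hypothetical tokenized training documents for one class.
spam_docs = [["buy", "cheap", "pills"], ["cheap", "pills", "now"]]
spam_probs = term_probabilities(spam_docs)
p = likelihood(["cheap", "pills"], spam_probs)
```

Because a single unseen term drives the whole product to zero, practical implementations usually smooth the term estimates and sum log-probabilities instead of multiplying.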