CSCE 470 Lecture 25

Wednesday, October 23, 2013


Privacy

Just covered this stuff...

Root-Mean-Square Error (RMSE)

(See Wikipedia: Root-mean-square deviation)


Used for evaluating the effectiveness of a model/estimator (or in our case, a recommender based on ratings)

$$\mathrm{RMSE} = \sqrt{\frac{\sum \left( \text{predicted} - \text{actual} \right)^2}{n}}$$

Given matrix of data:

$$\begin{bmatrix} 4 & & 3 & \\ 3 & 4 & 2^* & \\ 3 & 5^* & 5 & \\ & & 4^* & 5 \end{bmatrix}$$

The starred items are not known to the algorithm. Suppose the algorithm gives the following scores:

$$\begin{bmatrix} 4 & & 3 & \\ 3 & 4 & 3 & \\ 3 & 2 & 5 & \\ & & 1 & 5 \end{bmatrix}$$

The RMSE is

$$\sqrt{\frac{(3-2)^2 + (2-5)^2 + (1-4)^2}{3}} \approx 2.516611478$$

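The goal is to minimize this value. For example, an RMSE of 1 is what we get if the recommender is off by exactly one on every single value (other error patterns can also produce 1, since large errors are weighted more heavily than small ones).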
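
The worked example above can be checked with a few lines of Python. This is a minimal sketch; the function name `rmse` and the flat-list representation of the starred entries are choices made here for illustration, not something from the lecture.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error over paired predicted/actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Starred (held-out) entries from the example, read row by row.
actual = [2, 5, 4]
predicted = [3, 2, 1]
print(rmse(predicted, actual))  # approximately 2.5166
```

Only the held-out entries enter the sum; the entries the algorithm already knew are not counted as predictions.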

Introspection

Does rating really equal enjoyment?

Well, $\text{rating} = \text{enjoyment}(\text{context})$

Recommender Systems

Collaborative Filtering

  1. Take everybody and use the average rating
  2. Find the k nearest neighbors (KNN; see below) and use their ratings
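
The first option can be sketched as follows. This assumes one reading of "take everybody": average every known rating for the item in question. The dict-of-dicts layout, the user names `u1`..`u4`, and the item names `a`..`d` are hypothetical; the values are the known (unstarred) entries of the example matrix, assuming its rows are users and its columns are items.

```python
def average_rating(ratings, item):
    """Mean of every known rating for `item`, or None if nobody rated it."""
    known = [user_ratings[item]
             for user_ratings in ratings.values()
             if item in user_ratings]
    return sum(known) / len(known) if known else None

# Hypothetical layout: user -> {item: rating}; missing keys are unknown ratings.
ratings = {
    "u1": {"a": 4, "c": 3},
    "u2": {"a": 3, "b": 4},
    "u3": {"a": 3, "c": 5},
    "u4": {"d": 5},
}
print(average_rating(ratings, "c"))  # 4.0
```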

To find similarity (as in KNN), we need:

  1. representation (usually a vector)
    • ratings
    • Watched-or-not (binary)
    • distance from average
  2. a way to compare
    • Manhattan
    • Euclidean
    • Cosine
    • Jaccard
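
The four comparison measures listed above could be written roughly like this. It is a sketch using the standard textbook definitions; the function names are chosen here, the vector measures assume equal-length rating vectors, and Jaccard is shown on sets (which fits the watched-or-not representation).

```python
import math

def manhattan(u, v):
    """Sum of absolute coordinate differences (a distance: smaller is closer)."""
    return sum(abs(a - b) for a, b in zip(u, v))

def euclidean(u, v):
    """Straight-line distance (smaller is closer)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (a similarity: larger is closer)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

def jaccard(s, t):
    """Size of the intersection over size of the union of two sets."""
    s, t = set(s), set(t)
    union = s | t
    return len(s & t) / len(union) if union else 0.0

print(manhattan([4, 3], [3, 5]))          # 3
print(euclidean([0, 0], [3, 4]))          # 5.0
print(jaccard({"a", "b"}, {"b", "c"}))    # one shared item out of three
```

Note that the first two are distances and the last two are similarities, so "nearest" means smallest for one pair and largest for the other.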

Once we've found several similar users, how should we compute a score from their ratings?

  • average
  • max (optimistic)
  • min (don't suck)
  • weighted average
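
To make the four options concrete, here is a small sketch. The neighbor ratings and weights are made-up numbers, and using each neighbor's similarity as its weight is an assumption; the notes do not say what the weights should be.

```python
def weighted_average(ratings, weights):
    """Average of `ratings` where each rating counts in proportion to its weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(r * w for r, w in zip(ratings, weights)) / total

# Hypothetical ratings from three similar users, and a weight for each user.
neighbor_ratings = [5, 3, 4]
weights = [2, 1, 1]

print(sum(neighbor_ratings) / len(neighbor_ratings))  # average: 4.0
print(max(neighbor_ratings))                          # optimistic: 5
print(min(neighbor_ratings))                          # pessimistic: 3
print(weighted_average(neighbor_ratings, weights))    # 4.25
```

With the weighted average, the neighbor with weight 2 pulls the score toward its rating of 5.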

How Could Clustering Help Us?

Find groups of people who have similar tastes
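
As an illustration of grouping users by taste, here is a tiny k-means sketch. The notes do not name a clustering algorithm, so k-means is a stand-in chosen here; the rating vectors are made up, and seeding the centroids with the first k points is an arbitrary, deterministic simplification. It needs Python 3.8+ for `math.dist`.

```python
import math

def kmeans(points, k, iterations=10):
    """Return a cluster index for each point, using plain k-means."""
    # Seed centroids with the first k points (a simplification for the sketch).
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

# Hypothetical users, each a vector of ratings for the same four items.
users = [
    [5, 5, 1, 1],
    [1, 1, 5, 5],
    [4, 5, 1, 2],
    [1, 2, 5, 4],
]
print(kmeans(users, k=2))  # [0, 1, 0, 1]
```

Users 0 and 2 land in one group and users 1 and 3 in the other, so a recommender could restrict its neighbor search to the group a user belongs to.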