CSCE 470 Lecture 8


« previous | Wednesday, September 11, 2013 | next »


Measuring Performance

Suppose we have labels (relevant or not relevant) on query-document pairs, supplied by a trusted source, for a collection of 100 M documents.

Note: These labels may not match what an individual user finds relevant (or may not apply at all, in the case of marginal relevance and mirroring), but they are at least an approximation.

We have two engines, "mine" and "yours", that we wish to test against our document set.

For one query (e.g. "bob"), suppose

  • "mine" finds [ irrelevant, relevant, irrelevant, relevant, relevant ]
  • "yours" finds [ relevant, relevant ]

From this, we can make two observations:

  • yours returned relevant results first (assume ordered search)
  • mine returned more relevant results overall

But one query is not good enough. Let's run 10,000 queries. Suppose that in reality, "yours" actually works better than "mine" on average.

Remember that there are 4 ways to measure:

  1. Precision
    • The fraction of returned results that are relevant. In the example query ("bob") above, "mine": $3/5$ were actually relevant; and "yours": $2/2$
    • Take into account ranking (to an extent): precision at $k$ counts only the top $k$ results; for example, at $k = 2$ → "mine": $1/2$, "yours": $2/2$
  2. Recall
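    • The fraction of all relevant documents that were returned. If $n$ documents in the collection are relevant to "bob", then "mine": $3/n$, "yours": $2/n$
    • Suppose "mine" was "optimized" for recall and just returned all documents in the collection; now recall is $1$, but precision falls to $n / 100\,\text{M}$
  3. F-measure: $F = \frac{2PR}{P + R}$, the harmonic mean of precision $P$ and recall $R$
    • Note the tension between precision and recall: returning more documents tends to raise recall and lower precision, and returning fewer does the opposite
    • With $n$ relevant documents as above, "mine": $6/(5 + n)$; "yours": $4/(2 + n)$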
  4. Normalized Discounted Cumulative Gain (NDCG)
    • Hottest measure right now
    • Sensitive to position of highest rated page
    • Log-discounting of results
    • Normalized for result lists of different lengths
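
The first three measures can be checked against the "bob" example with a short Python sketch. The notes do not say how many relevant documents exist for this query, so `TOTAL_RELEVANT = 5` below is a purely hypothetical value chosen for illustration; the function names are likewise illustrative, and the F-measure shown is the balanced (harmonic mean) form.

```python
def precision(results):
    """Fraction of returned results that are relevant."""
    return sum(results) / len(results) if results else 0.0


def precision_at_k(results, k):
    """Precision over the top k results; missing positions count as misses."""
    return sum(results[:k]) / k


def recall(results, total_relevant):
    """Fraction of all relevant documents that were returned."""
    return sum(results) / total_relevant


def f_measure(p, r):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# True = relevant, False = irrelevant, in ranked order (the "bob" example).
mine = [False, True, False, True, True]
yours = [True, True]

# ASSUMPTION: hypothetical count of relevant documents for "bob";
# the notes do not give this number.
TOTAL_RELEVANT = 5

for name, results in (("mine", mine), ("yours", yours)):
    p = precision(results)
    r = recall(results, TOTAL_RELEVANT)
    print(f"{name}: P={p:.3f} P@2={precision_at_k(results, 2):.3f} "
          f"R={r:.3f} F={f_measure(p, r):.3f}")
```

Changing `TOTAL_RELEVANT` shifts recall and the F-measure for both engines but leaves precision untouched, which is one way to see why precision alone cannot rank the two engines.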


NDCG

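A common way to write NDCG at cutoff $k$ is $N_k = Z_k \sum_{j=1}^{k} \frac{2^{r(j)} - 1}{\log_2(1 + j)}$, where $r(j)$ is the relevance label of the result at position $j$:

  • $j$ represents our result's rank
  • $Z_k$ is normalization
  • $\sum_{j=1}^{k}$ is a cumulation
  • $2^{r(j)} - 1$ is gain
  • $1/\log_2(1 + j)$ is position discount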
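
The components listed above (gain, position discount, cumulation, normalization) can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes a base-2 log discount, it normalizes by the best ordering of the returned list only (a full evaluation would use the best ordering over all labeled documents for the query), and the gain values in the demo are made up.

```python
import math


def dcg(gains):
    """Cumulate gains, each discounted by log2(1 + rank)."""
    return sum(g / math.log2(1 + rank)
               for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending-gain) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Hypothetical gain values: the same three results in two different orders.
best_first = [7, 1, 1]
best_last = [1, 1, 7]

print(ndcg(best_first))  # ideal ordering, so the score is 1.0
print(ndcg(best_last))   # lower: the highest-rated page is discounted most
```

Both lists have the same cumulative gain, so the difference in score comes entirely from where the highest-rated result sits.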

Let's define our gain label function as follows:

  • Perfect:
  • Excellent:
  • Average:
  • Fair:
  • Poor:


Example query: {abc}

discounting factor is $1/\log_2(1 + \text{rank})$

Rank URL Gain Cumulative Gain Discounted Cumulative Gain Max DCG NDCG
1 http://abc.go.com/ Perfect:
2 http://www.abcteach.com Fair:
3 http://abcnews.go.com/sections Excellent:
4 http://www.abc.net.au/ Excellent:
5 http://abcnews.go.com/ Excellent:
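
The columns of this table can be filled in programmatically. The sketch below is hedged on two assumptions that are not taken from the notes: the numeric gain assigned to each label (the `GAIN` mapping is hypothetical, following a $2^r - 1$ style scale) and a base-2 log discount. Max DCG at each rank is computed from the best possible ordering of these same five results.

```python
import math

# ASSUMPTION: hypothetical gain per label; the notes' own values are not shown.
GAIN = {"Perfect": 31, "Excellent": 15, "Average": 7, "Fair": 3, "Poor": 0}

# Ranked results for the example query {abc}, with their labels from the table.
results = [
    ("http://abc.go.com/", "Perfect"),
    ("http://www.abcteach.com", "Fair"),
    ("http://abcnews.go.com/sections", "Excellent"),
    ("http://www.abc.net.au/", "Excellent"),
    ("http://abcnews.go.com/", "Excellent"),
]


def discount(rank):
    """Position discount, assuming a base-2 logarithm."""
    return 1 / math.log2(1 + rank)


def ndcg_table(labels):
    """Return (rank, gain, CG, DCG, max DCG, NDCG) for each rank."""
    gains = [GAIN[label] for label in labels]
    ideal = sorted(gains, reverse=True)  # best possible ordering
    rows = []
    cg = dcg = max_dcg = 0.0
    for rank, (g, best) in enumerate(zip(gains, ideal), start=1):
        cg += g
        dcg += g * discount(rank)
        max_dcg += best * discount(rank)
        rows.append((rank, g, cg, dcg, max_dcg,
                     dcg / max_dcg if max_dcg else 0.0))
    return rows


rows = ndcg_table([label for _, label in results])
for rank, g, cg, dcg, max_dcg, ndcg in rows:
    print(f"{rank}  gain={g:<3} CG={cg:<5.0f} DCG={dcg:<6.2f} "
          f"maxDCG={max_dcg:<6.2f} NDCG={ndcg:.2f}")
```

With any gain scale that ranks Excellent above Fair, NDCG drops below 1 at rank 2, because a Fair result sits where an Excellent one could have been.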