CSCE 470 Lecture 32
Wednesday, November 13, 2013
Learning To Rank
Last time was just "learning relevance"
Machine Learning and Ranked Info Retrieval have been around for a long time... so why didn't the Machine Learning and Ranking communities get together earlier?
- "Progressive development": collectively we know more now than researchers did years ago.
- They didn't know about each other
- The people didn't have access to much training data
- Just takes time to be appreciated
What we talked about (relevant or not) is just classic classification: mapping to an unordered set of classes.
Solution:
- Regression problems: map to a real value
- Ordinal regression: map to an ordered set of classes (buckets), e.g. exam question types ordered by depth:
  - Short Answer
  - Work-out Question
  - Thinking Question
  - Synthesis Question (put things together)
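A minimal sketch of the point-wise/ordinal idea: regress a real-valued relevance score from features, then bucket it into ordered classes. The feature values, grades, and cutoffs below are all made up for illustration:

```python
# Point-wise ordinal regression as "regress, then bucket".
# (feature, grade) pairs are hypothetical training data.
train = [(0.1, 0), (0.3, 1), (0.5, 1), (0.7, 2), (0.9, 3)]

# Fit grade = w*x + b by ordinary least squares (closed form, one feature).
n = len(train)
sx = sum(x for x, _ in train)
sy = sum(y for _, y in train)
sxx = sum(x * x for x, _ in train)
sxy = sum(x * y for x, y in train)
w = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - w * sx) / n

def bucket(score, cutoffs=(0.5, 1.5, 2.5)):
    """Map a real-valued prediction into one of 4 ordered classes."""
    return sum(score > c for c in cutoffs)

pred = w * 0.8 + b   # predict a grade for a new document's feature value
print(bucket(pred))  # prints 2: an ordered class index in {0, 1, 2, 3}
```

The cutoffs that define the buckets would themselves be tuned in a real ordinal regression method; fixed thresholds are the simplest possible choice.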
Assume we have $K$ categories of relevance $C_1, C_2, \dots, C_K$ with an ordering $C_1 \succ C_2 \succ \dots \succ C_K$.
Assume training data is available consisting of document-query pairs represented as feature vectors $\psi_i$ with relevance rankings $c_i$.
Two ways:
- Point-wise learning
- Pair-wise learning
Pair-Wise Learning
Main Idea: Take pairs of documents and determine which document is better.
Construct a vector of features $\psi_i = \psi(d_i, q)$ for each document-query pair.
Now each training instance consists of two documents $d_i$ and $d_j$ for the same query, together with a judgment of which is more relevant.
Example
Title | query | cosine | pagerank | loadtime | label
---|---|---|---|---|---
My Blog | johnny football | 0.2 | 0.1 | 0.01 | Poor
ESPN | johnny football | 0.3 | 0.2 | 0.01 | Excellent
Calculate Differences
Title I | Title J | query | Δ cosine | Δ pagerank | Δ loadtime | label
---|---|---|---|---|---|---
My Blog | ESPN | johnny football | -0.1 | -0.1 | 0 | J
My Blog | Your Blog | johnny football | 0.0 | 0.1 | 0.01 | I
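The pair-wise idea can be sketched as a linear classifier trained on these feature *differences*. This is only one simple choice of classifier (a perceptron); the Δ-feature values below mirror the toy tables, padded with two extra hypothetical pairs:

```python
# Pair-wise learning: classify which of two documents is better using
# the difference of their feature vectors (Δcosine, Δpagerank, Δloadtime).
# Label +1 means document I is better, -1 means document J is better.
pairs = [
    ((-0.1, -0.1,  0.0),  -1),   # My Blog vs ESPN      -> J better
    (( 0.0,  0.1,  0.01), +1),   # My Blog vs Your Blog -> I better
    (( 0.1,  0.1,  0.0),  +1),   # extra made-up pair
    ((-0.2, -0.1, -0.01), -1),   # extra made-up pair
]

# Train a perceptron: update weights on each misclassified pair.
w = [0.0, 0.0, 0.0]
for _ in range(20):  # a few epochs over the training pairs
    for x, y in pairs:
        score = sum(wi * xi for wi, xi in zip(w, x))
        if y * score <= 0:  # misclassified (or on the boundary)
            w = [wi + y * xi for wi, xi in zip(w, x)]

def better(feats_i, feats_j):
    """Comparator: True if document I should rank above document J."""
    diff = [a - b for a, b in zip(feats_i, feats_j)]
    return sum(wi * di for wi, di in zip(w, diff)) > 0

# ESPN's features dominate My Blog's, so it should rank higher:
print(better((0.3, 0.2, 0.01), (0.2, 0.1, 0.01)))  # prints True
```

Because the learned model scores feature differences, the same weights serve both as a classifier over pairs and as a comparator for sorting an entire result list.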
Now we have:
- A comparator to order between documents for a query
- A classifier for I or J (like what we covered last time)
Summary
Learned ranking ultimately beats traditional hand-designed ranking functions, especially since the hand-designed functions (e.g. cosine score, PageRank) can themselves be included as features.