Wednesday, September 18, 2013
Random surfer
Transition Probability Matrix
"If I'm on (row), what's the probability that I will go to (col)"
Each row must sum to 1, but what about pages without any outlinks?
When the surfer gets bored, he magically picks a random page (uniformly, so every entry is $\frac{1}{N}$, where $N$ is the number of pages) and continues walking. This is called the "I'm bored matrix" or the "teleport matrix":

$$T = \frac{1}{N} \begin{pmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{pmatrix}$$
Let's use $\alpha$ to represent our "am I not bored?" coin: with probability $\alpha$ I keep following links, so I get bored at each page with probability $1 - \alpha$.
For pages with no outlinks, we'll use a "hack" row from our teleport matrix (since our only way out is to teleport). Thus our "follow link matrix" becomes

$$P'_{ij} = \begin{cases} P_{ij} & \text{if page } i \text{ has outlinks} \\ \frac{1}{N} & \text{if page } i \text{ has no outlinks} \end{cases}$$
Now we can apply our "am I bored?" parameter to represent the choices we will make:

$$G = \alpha P' + (1 - \alpha) T$$
This shall henceforth be informally known as the "final super mega" matrix.
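To make the construction concrete, here is a minimal sketch (not from the lecture) that builds $P$, $T$, $P'$, and $G$ for a made-up four-page web; the link structure and $\alpha = 0.85$ are assumptions chosen purely for illustration.

```python
import numpy as np

N = 4
# Hypothetical link structure: links[i] = pages that page i links to.
# Page 3 has no outlinks (a "dangling" page).
links = {0: [1, 2], 1: [2], 2: [0, 1, 3], 3: []}

# Raw follow-link matrix P: row i spreads probability evenly over i's outlinks.
P = np.zeros((N, N))
for i, outs in links.items():
    for j in outs:
        P[i, j] = 1.0 / len(outs)

# Teleport ("I'm bored") matrix T: every entry is 1/N.
T = np.full((N, N), 1.0 / N)

# "Hack" for dangling pages: replace their all-zero rows with teleport rows.
P_prime = P.copy()
for i, outs in links.items():
    if not outs:
        P_prime[i, :] = 1.0 / N

# Mix according to the "am I not bored?" coin alpha.
alpha = 0.85
G = alpha * P_prime + (1 - alpha) * T

print(G.sum(axis=1))  # every row of the "final super mega" matrix sums to 1
```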
Markov Property
"I'm forgetful. I forgot where I was. All I know is that I'm here, so what do I do now?"
Formally: Markov chains are memoryless; the next state depends only on the current state, not on any history
Encode the current location as a vector. Suppose the surfer is at B:

$$\mathbf{x}_0 = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \end{pmatrix}$$

(a 1 in B's position, 0 everywhere else).
Our next position will be $\mathbf{x}_1 = \mathbf{x}_0 G$, then $\mathbf{x}_2 = \mathbf{x}_1 G = \mathbf{x}_0 G^2$, and so on.
We continue to infinity, and our position magically converges: $\mathbf{x}_n \to \boldsymbol{\pi}$ as $n \to \infty$. This converged vector $\boldsymbol{\pi}$ becomes our PageRank.
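Continuing the sketch above (same made-up $G$ and $N$, with B assumed to be index 1), the walk is just repeated vector-matrix multiplication:

```python
# Start the surfer at page B and repeatedly multiply the row vector by G;
# after enough steps the vector stops changing.
x = np.zeros(N)
x[1] = 1.0               # "all I know is that I'm here"
for n in range(100):
    x = x @ G            # x_{n+1} = x_n G
print(x)                 # the converged vector: our PageRank
```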
Things to take away
- Now to blow our mind: it doesn't matter where we start; the walk converges to the same vector $\boldsymbol{\pi}$ every time
- If we start with the converged vector, it converges in one step: $\boldsymbol{\pi} G = \boldsymbol{\pi}$ (it's a fixed point)
Since the start position doesn't actually matter, we could start simultaneously at all pages:

$$\mathbf{x}_0 = \begin{pmatrix} \frac{1}{N} & \frac{1}{N} & \cdots & \frac{1}{N} \end{pmatrix}$$

Bottom line:

$$\boldsymbol{\pi} G = \boldsymbol{\pi}$$

Thus the PageRank vector $\boldsymbol{\pi}$ is a left eigenvector of the transition matrix $G$ with eigenvalue 1 (i.e. $\boldsymbol{\pi}$ is an eigenvector of $G^{\mathsf{T}}$).
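As a sanity check on the same made-up $G$, the eigenvector view can be computed directly with numpy:

```python
# pi is a left eigenvector of G for eigenvalue 1, i.e. a right eigenvector of G^T.
vals, vecs = np.linalg.eig(G.T)
k = np.argmin(np.abs(vals - 1.0))   # pick the eigenvalue closest to 1
pi = np.real(vecs[:, k])
pi = pi / pi.sum()                  # scale so the entries sum to 1
print(pi)                           # matches the power-iteration result above
print(np.allclose(pi @ G, pi))      # pi G = pi  ->  True
```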
Solving the Markov Chain
Recall the equations we came up with last time: each page gets a teleport share plus $\alpha$ times the PageRank flowing in along its inlinks,

$$r(p) = \frac{1 - \alpha}{N} + \alpha \sum_{q \to p} \frac{r(q)}{\operatorname{outdeg}(q)}$$

The $(1 - \alpha)$ represents our teleport probability from our $G$ matrix above.
Let's add subscripts to reflect the fact that the values change at each iteration:

$$r_{k+1}(p) = \frac{1 - \alpha}{N} + \alpha \sum_{q \to p} \frac{r_k(q)}{\operatorname{outdeg}(q)}$$

We start our 0th iteration with all PageRank values equal to 1.
As $k \to \infty$, the values will converge to the same thing as our transition matrix approach above.
Obviously, since the system of equations can be represented as a matrix equation: it's just $\boldsymbol{\pi} = \boldsymbol{\pi} G$ from above, written out one page at a time.
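As a last sketch (same made-up graph and $\alpha$ as before), the per-page equations can be iterated directly; the dangling-page handling mirrors the "hack" rows, so this lands on the same vector as the matrix computation:

```python
# Per-page iteration of the equations above.
r = np.full(N, 1.0)                            # 0th iteration: every value is 1
for k in range(100):
    new_r = np.full(N, (1 - alpha) / N)        # teleport share for every page
    for i, outs in links.items():
        targets = outs if outs else range(N)   # dangling pages spread everywhere
        for j in targets:
            new_r[j] += alpha * r[i] / len(targets)
    r = new_r
print(r)  # converges to the same vector as the matrix iteration
```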