Recommendation System
Problem Formulation
$r(i,j)=1$ if user $j$ has rated movie $i$ (0 otherwise)
$y^{(i,j)}=$ rating given by user $j$ to movie $i$ (defined only if $r(i,j)=1$)
$\theta^{(j)} =$ parameter vector for user $j$
$x^{(i)}=$ feature vector for movie $i$
$(\theta^{(j)})^Tx^{(i)} =$ predicted rating for user $j$ on movie $i$
$m^{(j)} =$ number of movies rated by user $j$
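As a small illustration of this notation (a minimal sketch; the ratings below are made up and NumPy is assumed), the $y^{(i,j)}$ can be stored in an $n_m \times n_u$ matrix $Y$ together with an indicator matrix $R$:

```python
import numpy as np

# Toy data: n_m = 3 movies, n_u = 3 users (values are illustrative only).
# Y[i, j] = y^(i,j), the rating of movie i by user j (0 is a placeholder
# where no rating exists); R[i, j] = r(i,j), True iff user j rated movie i.
Y = np.array([[5, 4, 0],
              [0, 3, 2],
              [1, 0, 5]], dtype=float)
R = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=bool)

n_m, n_u = Y.shape
m_j = R.sum(axis=0)   # m^(j): number of movies rated by each user j
print(m_j)            # -> [2 2 2]
```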
Content-based Recommendation
Optimization objective
To learn $\theta^{(j)}$ (the parameter vector for user $j$):
$$ \begin{equation} \min_{\theta^{(j)}}{1\over 2}\sum_{i:r(i,j)=1}\left((\theta^{(j)})^Tx^{(i)} - y^{(i,j)}\right)^2 + {\lambda\over 2}\sum_{k=1}^n(\theta_k^{(j)})^2 \end{equation} $$
To learn $\theta^{(1)},\theta^{(2)},\dots,\theta^{(n_u)}$ (the parameters for all $n_u$ users):
$$ \begin{equation} \min_{\theta^{(1)},\theta^{(2)},\dots,\theta^{(n_u)}}{1\over 2}\sum_{j=1}^{n_u}\sum_{i:r(i,j)=1}\left((\theta^{(j)})^Tx^{(i)} - y^{(i,j)}\right)^2 + {\lambda\over 2}\sum_{j=1}^{n_u}\sum_{k=1}^n(\theta_k^{(j)})^2 \end{equation} $$
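A minimal sketch of this objective, assuming NumPy, a matrix `Theta` whose rows are the $\theta^{(j)}$, a known feature matrix `X` whose rows are the $x^{(i)}$ with an intercept feature $x_0 = 1$ in column 0 (so the regularization sum over $k = 1,\dots,n$ skips that column), and `Y`, `R` as above; all names are assumptions for illustration:

```python
import numpy as np

def content_based_cost(Theta, X, Y, R, lam):
    """Regularized cost for learning all theta^(j), with X given.

    Theta : (n_u, n+1)  row j is theta^(j)
    X     : (n_m, n+1)  row i is x^(i); column 0 is the intercept x_0 = 1
    Y, R  : (n_m, n_u)  ratings y^(i,j) and indicator r(i,j)
    lam   : regularization parameter lambda
    """
    errors = (X @ Theta.T - Y) * R                  # zero where r(i,j) = 0
    reg = (lam / 2.0) * np.sum(Theta[:, 1:] ** 2)   # sum over k = 1..n only
    return 0.5 * np.sum(errors ** 2) + reg
```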
Gradient Descent
$$ \begin{equation} \left\{\begin{array}{lr} \theta_k^{(j)} := \theta_k^{(j)}-\alpha\sum_{i:r(i,j)=1}\left((\theta^{(j)})^Tx^{(i)} - y^{(i,j)}\right)x_k^{(i)} & \text{for } k = 0 \\ \theta_k^{(j)} := \theta_k^{(j)}-\alpha\left(\sum_{i:r(i,j)=1}\left((\theta^{(j)})^Tx^{(i)} - y^{(i,j)}\right)x_k^{(i)} + \lambda\theta_k^{(j)}\right) & \text{for } k \neq 0 \end{array}\right.\end{equation} $$
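Under the same assumptions (NumPy, intercept feature in column 0 of `X`, so $\theta_0^{(j)}$ is not regularized), one simultaneous update of all the $\theta^{(j)}$ might look like this sketch:

```python
import numpy as np

def gradient_step(Theta, X, Y, R, lam, alpha):
    """One gradient-descent update of every theta^(j) at once."""
    errors = (X @ Theta.T - Y) * R        # (n_m, n_u), zero where r(i,j) = 0
    grad = errors.T @ X + lam * Theta     # sum over rated movies + regularization
    grad[:, 0] -= lam * Theta[:, 0]       # k = 0: no regularization on theta_0
    return Theta - alpha * grad
```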
Collaborative Filtering
Given $x^{(1)},x^{(2)},\dots,x^{(n_m)}$, to learn $\theta^{(1)},\theta^{(2)},\dots,\theta^{(n_u)}$:
$$ \begin{equation} \min_{\theta^{(1)},\theta^{(2)},\dots,\theta^{(n_u)}}{1\over 2}\sum_{j=1}^{n_u}\sum_{i:r(i,j)=1}\left((\theta^{(j)})^Tx^{(i)} - y^{(i,j)}\right)^2 + {\lambda\over 2}\sum_{j=1}^{n_u}\sum_{k=1}^n(\theta_k^{(j)})^2 \end{equation} $$
Given $\theta^{(1)},\theta^{(2)},\dots,\theta^{(n_u)}$, to learn $x^{(1)},x^{(2)},\dots,x^{(n_m)}$:
$$ \begin{equation} \min_{x^{(1)},x^{(2)},\dots,x^{(n_m)}}{1\over 2}\sum_{i=1}^{n_m}\sum_{j:r(i,j)=1}\left((\theta^{(j)})^Tx^{(i)} - y^{(i,j)}\right)^2 + {\lambda\over 2}\sum_{i=1}^{n_m}\sum_{k=1}^n(x_k^{(i)})^2 \end{equation} $$
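These two objectives can be read as an alternating procedure: with the $x^{(i)}$ fixed, improve the $\theta^{(j)}$; with the $\theta^{(j)}$ fixed, improve the $x^{(i)}$; and repeat. A rough sketch under the collaborative-filtering convention that the features are learned (no intercept column, so every component is regularized), with a made-up learning rate:

```python
import numpy as np

def alternate(X, Theta, Y, R, lam, alpha=0.005, iters=200):
    """Alternate plain gradient steps on Theta (X fixed) and X (Theta fixed)."""
    X, Theta = X.copy(), Theta.copy()
    for _ in range(iters):
        E = (X @ Theta.T - Y) * R                 # errors on rated entries only
        Theta -= alpha * (E.T @ X + lam * Theta)  # given the x's, improve the theta's
        E = (X @ Theta.T - Y) * R
        X -= alpha * (E @ Theta + lam * X)        # given the theta's, improve the x's
    return X, Theta
```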
Minimizing over $x^{(1)},x^{(2)},\dots,x^{(n_m)}$ and $\theta^{(1)},\theta^{(2)},\dots,\theta^{(n_u)}$ simultaneously:
$$ \begin{equation}J={1\over 2}\sum_{(i,j):r(i,j)=1}\left((\theta^{(j)})^Tx^{(i)} - y^{(i,j)}\right)^2+{\lambda\over 2}\sum_{j=1}^{n_u}\sum_{k=1}^n(\theta_k^{(j)})^2+{\lambda\over 2}\sum_{i=1}^{n_m}\sum_{k=1}^n(x_k^{(i)})^2\end{equation} $$
$$ \begin{equation}\min_{x^{(1)},x^{(2)},\dots,x^{(n_m)},\theta^{(1)},\theta^{(2)},\dots,\theta^{(n_u)}} J(x^{(1)},x^{(2)},\dots,x^{(n_m)},\theta^{(1)},\theta^{(2)},\dots,\theta^{(n_u)})\end{equation} $$
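A sketch of the joint cost $J$ and its gradients with respect to both parameter sets (same assumptions as above: NumPy, learned features with no intercept column); these are the quantities a simultaneous gradient-descent step, or an off-the-shelf minimizer, would use:

```python
import numpy as np

def cofi_cost_and_grads(X, Theta, Y, R, lam):
    """Joint cost J(x^(1..n_m), theta^(1..n_u)) and its gradients."""
    E = (X @ Theta.T - Y) * R                 # errors only where r(i,j) = 1
    J = 0.5 * np.sum(E ** 2) \
        + (lam / 2.0) * (np.sum(Theta ** 2) + np.sum(X ** 2))
    X_grad = E @ Theta + lam * X              # dJ / dx_k^(i)
    Theta_grad = E.T @ X + lam * Theta        # dJ / dtheta_k^(j)
    return J, X_grad, Theta_grad
```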