Linear Regression

黎浩然 / November 2, 2023 / Machine Learning, Postgraduate

Hypothesis

The hypothesis for linear regression with $m$ samples and $n$ features is:

$$ \begin{equation}h_\theta(x)=\theta^Tx=\theta_0+\theta_1x_1+\theta_2x_2+\cdots+\theta_nx_n\end{equation} $$
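In vectorized form this is a single dot product. A minimal sketch assuming NumPy (the function name `hypothesis` is illustrative, not from the post); $x_0 = 1$ is prepended so that $\theta^Tx$ includes the intercept term:

```python
import numpy as np

def hypothesis(theta, x_features):
    """h_theta(x) = theta^T x, prepending x_0 = 1 for the intercept term."""
    x = np.concatenate(([1.0], x_features))
    return theta @ x
```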

Cost Function

$$ \begin{equation}J(\theta)=J(\theta_0,\theta_1,\cdots,\theta_n)={1\over2m}\sum_{i=1}^m{(h_{\theta}(x^{(i)})-y^{(i)})}^2\end{equation} $$
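The cost can be computed in one vectorized expression. A sketch assuming NumPy, with `X` the $m \times (n+1)$ design matrix (first column all ones), `y` the target vector, and `theta` the parameter vector; these names are illustrative:

```python
import numpy as np

def compute_cost(X, y, theta):
    """J(theta) = (1 / 2m) * sum_i (h_theta(x^(i)) - y^(i))^2, vectorized."""
    m = len(y)
    residuals = X @ theta - y          # h_theta(x^(i)) - y^(i) for every sample
    return residuals @ residuals / (2 * m)
```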

Feature Scaling

Get every feature into approximately a $-1 \leq x_i \leq 1$ range.

Feature scaling – Wikipedia
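One common way to do this is mean normalization: center each feature and divide by its range. A minimal sketch (NumPy and the function name are my assumptions); apply it to the raw feature columns before prepending the $x_0 = 1$ column, whose range would be zero:

```python
import numpy as np

def scale_features(X):
    """Mean normalization per column: (x - mean) / (max - min)."""
    mu = X.mean(axis=0)
    spread = X.max(axis=0) - X.min(axis=0)
    return (X - mu) / spread, mu, spread
```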

Gradient Descent

$$ \begin{equation}{\partial J(\theta) \over \partial\theta_j} = {1\over m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})x^{(i)}_j\end{equation} $$

$$ \begin{equation}\theta_j := \theta_j - \alpha{1\over m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})x^{(i)}_j\end{equation} $$

Use the formula above to update $\theta_0,\theta_1,\cdots,\theta_n$ simultaneously, where $x^{(i)}_0=1$.
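A sketch of the batch update loop (assuming NumPy; the `alpha` and `num_iters` defaults are illustrative choices, not values from this post):

```python
import numpy as np

def gradient_descent(X, y, theta, alpha=0.01, num_iters=1500):
    """Batch gradient descent with simultaneous updates of every theta_j."""
    m = len(y)
    for _ in range(num_iters):
        grad = X.T @ (X @ theta - y) / m   # (1/m) * sum_i (h - y) * x_j^(i)
        theta = theta - alpha * grad       # all theta_j updated at once
    return theta
```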

Learning Rate

If $\alpha$ is too small: slow convergence; if $\alpha$ is too large: gradient descent may fail to converge (and can even diverge).
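A common diagnostic, sketched here using the illustrative `gradient_descent` and `compute_cost` functions above: run a few iterations for several candidate values of $\alpha$ (the values below are arbitrary) and check that $J(\theta)$ decreases:

```python
import numpy as np

# X: design matrix with the x_0 = 1 column; y: targets (assumed already defined)
for alpha in (0.001, 0.01, 0.1, 1.0):
    theta = gradient_descent(X, y, np.zeros(X.shape[1]), alpha=alpha, num_iters=50)
    print(alpha, compute_cost(X, y, theta))  # if J grows instead of shrinking, alpha is too large
```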

Normal Equation

If

$$ \begin{equation}X = \begin{bmatrix} x_0^{(1)} & x_1^{(1)} & \cdots & x_n^{(1)} \\ x_0^{(2)} & x_1^{(2)} & \cdots & x_n^{(2)} \\ \vdots & \vdots & & \vdots\\ x_0^{(m)} & x_1^{(m)} & \cdots & x_n^{(m)} \end{bmatrix} = \begin{bmatrix} 1 & x_1^{(1)} & \cdots & x_n^{(1)} \\ 1 & x_1^{(2)} & \cdots & x_n^{(2)} \\ \vdots & \vdots & & \vdots\\ 1 & x_1^{(m)} & \cdots & x_n^{(m)} \end{bmatrix}\end{equation} $$

$$ \begin{equation}y=\begin{bmatrix} y^{(1)}\\y^{(2)}\\\vdots\\y^{(m)}\end{bmatrix}\end{equation} $$

then:

$$ \begin{equation}\theta=\begin{bmatrix} \theta_0\\\theta_1\\\vdots\\\theta_n\end{bmatrix}=(X^TX)^{-1}X^Ty\end{equation} $$
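This maps directly to a few NumPy calls. A sketch (I use `np.linalg.pinv` rather than an explicit inverse so a singular $X^TX$ is also handled; that substitution is my choice, not the post's):

```python
import numpy as np

def normal_equation(X, y):
    """theta = (X^T X)^{-1} X^T y; pinv also covers a singular X^T X."""
    return np.linalg.pinv(X.T @ X) @ X.T @ y
```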
