Machine Learning Whiteboard Derivations 2: The Gaussian Distribution
1. The Gaussian distribution from its probability density function
The PDF of the Gaussian distribution:
P(\mathbf{x}) = \frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}\operatorname{ exp }\left\{ -\frac{1}{2}(\mathbf{x}-\mathbf{\mu})^T\Sigma^{-1}(\mathbf{x}-\mathbf{\mu}) \right\}
The term -\frac{1}{2}(\mathbf{x}-\mathbf{\mu})^T\Sigma^{-1}(\mathbf{x}-\mathbf{\mu}) is a quadratic form, where \mathbf{x} \in \mathbb{R}^p is a p-dimensional random variable.
\begin{equation}
\mathbf{x}=
\begin{pmatrix}
x_1 \\
x_2 \\
\vdots \\
x_p
\end{pmatrix} \qquad
\mathbf{\mu}= ...
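To make the density concrete, here is a minimal numpy sketch (not from the original post; the function name gaussian_pdf and the values of x, mu, and sigma below are illustrative) that evaluates the p-dimensional Gaussian PDF defined above:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Evaluate the p-dimensional Gaussian density at x."""
    p = x.shape[0]
    diff = x - mu
    # Quadratic form -(1/2) (x - mu)^T Sigma^{-1} (x - mu); solve() avoids
    # forming Sigma^{-1} explicitly.
    quad = -0.5 * diff @ np.linalg.solve(sigma, diff)
    norm_const = (2 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(sigma))
    return np.exp(quad) / norm_const

x = np.array([1.0, 2.0])
mu = np.array([0.0, 0.0])
sigma = np.array([[2.0, 0.3], [0.3, 1.0]])
print(gaussian_pdf(x, mu, sigma))
```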
Machine Learning Whiteboard Derivations 1: Probability Basics
1. MLE
The univariate Gaussian distribution:
P(x) = {\frac {1}{ {\sqrt {2\pi }\sigma}}}\operatorname{ exp } \left(-{\frac {\left(x-\mu \right)^{2}}{2\sigma ^{2}}}\right) \tag{1}
The multivariate Gaussian distribution:
P(X) = \frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}\operatorname{ exp }\left\{ -\frac{1}{2}(X-\mu)^T\Sigma^{-1}(X-\mu) \right\} \tag{2}
Estimating the parameter \theta by maximum likelihood gives:
\begin{align}
L(\theta) = \operatorname{ \log } P(x | \theta) = &\operatorname{ \log } \prod_{i=1}^N P(x_i|\theta)=\sum_{i=1}^N \operatorname{ \log } P(x_i|\th ...
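As a quick sanity check on the MLE (a sketch with simulated data, not from the post): for a univariate Gaussian, maximizing the log-likelihood above gives the sample mean and the biased sample variance in closed form.

```python
import numpy as np

# Simulate N draws from N(mu=2.0, sigma=1.5) and recover the MLEs.
rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=1.5, size=10_000)

mu_hat = samples.mean()                        # MLE of mu
sigma2_hat = ((samples - mu_hat) ** 2).mean()  # MLE of sigma^2 (divides by N)
print(mu_hat, np.sqrt(sigma2_hat))             # close to 2.0 and 1.5
```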
Lecture 11 PageRank and Ridge Regression
1. PageRank
Google PageRank
The original Google paper
Imagine surfing the web by randomly clicking links. This is called a "random walk" over the graph. If you do this long enough, you eventually reach a "steady state" in which the probability that you are at site i is \pi_i.
Let $\tilde A$ denote the adjacency matrix of this graph, where:
\tilde A_{ij} = \begin{cases}
1, & \text{if } j \text{ links to } i,\\
0, & \text{otherwise}
\end{cases}\tag{1}
Concretely:
\tilde A =
\left[\begin{array}{r}
0 & 1 & 1 & 1 \\
0 & 0 & 1 ...
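As a rough illustration of the steady-state idea (the lecture's own matrix is cut off above, so the 4-site link graph below is illustrative, not the lecture's), power iteration with a column-normalized $\tilde A$ converges to the PageRank vector $\pi$:

```python
import numpy as np

# Illustrative link graph: A[i, j] = 1 if page j links to page i.
A = np.array([[0, 1, 1, 1],
              [1, 0, 0, 0],
              [1, 1, 0, 1],
              [1, 1, 1, 0]], dtype=float)

# Column-normalize so each column gives the probability of following a link.
P = A / A.sum(axis=0, keepdims=True)

# Power iteration: apply P repeatedly until pi stops changing.
pi = np.full(4, 0.25)
for _ in range(100):
    pi = P @ pi
print(pi)  # steady-state probability pi_i of being at site i
```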
Lecture 10 More on the SVD in Machine Learning
1. SVD in machine learning
Dimensionality Reduction (PCA)
Principal Components Regression
Least Squares
Matrix Completion
PageRank
2. Least Squares & SVD
As before, we have:
Assume $n \ge p$ and that $X$ has $p$ linearly independent columns; then
\hat w = (X^TX)^{-1}X^Ty \tag{1}
Case 1: If $X= U \Sigma V^T$, where $U : n\times n ,\ \ \Sigma: n \times p ,\ \ V^T: p \times p$, then
(X^TX)^{-1}X^T = (V \Sigma^T U^T U \Sigma V^T)^{-1}V\Sigma^TU^T=(V \Sigma^T \Sigma V^T)^{-1}V ...
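A small numpy sketch (illustrative data, not the lecture's) confirming that the normal-equations solution and the SVD-based solution agree when $n \ge p$ and $X$ has full column rank:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))                 # n = 50 >= p = 3, full rank
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

# Normal equations: w = (X^T X)^{-1} X^T y
w_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Same solution via the SVD: w = V Sigma^{-1} U^T y
U, s, Vt = np.linalg.svd(X, full_matrices=False)
w_svd = Vt.T @ ((U.T @ y) / s)

print(np.allclose(w_normal, w_svd))  # True
```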
Lecture 9 The SVD in Machine Learning
Matrix representation
X=
\left[\begin{array}{r}
1 & -1 & -1 & 1 \\
-1 & 1 & -1 & 1 \\
1 & -1 & -1 & 1 \\
-1 & 1 & -1 & 1 \\
1 & -1 & 0 & 0
\end{array} \right]= \left[\begin{array}{r}
1 & -1 \\
-1 & 1 \\
1 & -1 \\
-1 & 1 \\
1 & -1
\end{array} \right]\left[\begin{array}{r}
1 & -1 &0&0 \\
0 & 0&-1&1 \\
\end{array} \right]\tag{1}
The factorization is really built from column-vector multiplication; the first column of $X$ is given by:
X[1]=\left[\begin{array}{r}
1 \\
-1 \\
1 \\
-1 \\
1
\end{array} \ ...
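A quick numpy check of the column-times-row view of this factorization (the matrices B and C below mirror the shapes used above but are illustrative placeholders, not the lecture's exact values): the product of a 5×2 and a 2×4 matrix is the sum of outer products, so each column of the product is a weighted combination of the columns of the left factor.

```python
import numpy as np

B = np.array([[1, -1], [-1, 1], [1, -1], [-1, 1], [1, -1]], dtype=float)
C = np.array([[1, -1, 0, 0], [0, 0, -1, 1]], dtype=float)

X = B @ C
# Sum of outer products: column k of B times row k of C.
X_outer = np.outer(B[:, 0], C[0, :]) + np.outer(B[:, 1], C[1, :])
print(np.allclose(X, X_outer))  # True
print(X[:, 0])                  # first column = 1*B[:,0] + 0*B[:,1]
```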
Lecture 8 The Singular Value Decomposition
1. SVD
$X \in \mathbb{R}^{n \times p}$ has an SVD $U\Sigma V^T$ satisfying:
$U \in \mathbb{R}^{n \times n}$ is orthogonal, $UU^T=U^TU=I$ [columns of $U$ = left singular vectors]
$V \in \mathbb{R}^{p \times p}$ is orthogonal, $VV^T=V^TV=I$ [columns of $V$ = right singular vectors]
$\Sigma \in \mathbb{R}^{n \times p}$ is diagonal, with diagonal entries $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_p$.
(Figure: SVD schematic)
```python
import numpy as np
A = np.mat([[10, 0, 0], [0, 5, 0], [0, 0 ...
```
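Since the post's own code block is truncated in this preview, here is a self-contained sketch (the matrix A is illustrative and not necessarily the post's) of computing an SVD with numpy and checking the properties listed above:

```python
import numpy as np

A = np.array([[10.0, 0.0, 0.0],
              [0.0, 5.0, 0.0],
              [0.0, 0.0, 2.0],
              [1.0, 1.0, 1.0]])

# Thin SVD: U is n x p, sigma holds the singular values in descending order,
# and Vt is V^T (p x p).
U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
print(sigma)                                    # sigma_1 >= ... >= sigma_p
print(np.allclose(U @ np.diag(sigma) @ Vt, A))  # U Sigma V^T reconstructs A
print(np.allclose(Vt @ Vt.T, np.eye(3)))        # V is orthogonal
```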
Lecture 7 Introduction to the Singular Value Decomposition
1. Problem setup and properties of projection matrices
Goal: for observations $X_1, X_2,\cdots, X_p \in \mathbb{R}^n$, find a one-dimensional subspace (think of it as a line) that "best fits the data".
Solution:
Project each $X_i$ onto $\overrightarrow a$, giving $Proj^{X_i}_{\overrightarrow a}$, and choose $\overrightarrow a$ so that the sum of the distances $d^2_i=\lVert X_i- Proj^{X_i}_{\overrightarrow a}\rVert^2_{2}$ is minimized.
Review of Projection Matrices:
For the subspace spanned by $A \in \mathbb{R}^{n \times p}$: the projection of $X$ onto $span(cols(A))$ is $Proj_A X$, and if the columns of $A$ are linearly independent, then:
Proj_A X = A(A^TA)^{-1}A^TX \tag{1} ...
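A minimal numpy sketch of this projection formula (the matrix A and the vector x are illustrative): the residual after projecting onto span(cols(A)) is orthogonal to every column of A.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])      # linearly independent columns
x = np.array([1.0, 2.0, 3.0])

# Projection matrix P = A (A^T A)^{-1} A^T
P = A @ np.linalg.inv(A.T @ A) @ A.T
proj = P @ x

print(A.T @ (x - proj))  # ~[0, 0]: residual is orthogonal to cols(A)
```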
Lecture 6 Finding Orthogonal Bases
1. How do we find $U$?
Gram-Schmidt Orthogonalization
Singular Value Decomposition
Given $X$, how do we find $U$?
The residual is X^{\prime}_2=X_2-aU_1, so
U_2=\frac {X^{\prime}_2} {\lVert X^{\prime}_2 \rVert_{2}}=\left[\begin{array}{c}
0 \\
b
\end{array}\right]/b=\left[\begin{array}{c}
0 \\
1
\end{array}\right]
2. A geometric example of Gram-Schmidt orthogonalization
\begin{align}
&X^{\prime}_2=X_2-\text{Projection}(X_2 \text{ onto } U_1), \ \text{best fit of } X_2 \text{ as weighted } U_1. \\
&\text{best fit} = U_1w = U_1U^{T}_1X_2=P_{U_1}(X ...
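A short numpy sketch of Gram-Schmidt as described here (the input matrix is illustrative; its second column reproduces the [0, b]/b = [0, 1] step above):

```python
import numpy as np

def gram_schmidt(X):
    """Orthonormalize the columns of X (assumed linearly independent)."""
    U = np.zeros_like(X, dtype=float)
    for k in range(X.shape[1]):
        # Subtract the projection onto the already-built orthonormal columns,
        # then normalize the residual.
        residual = X[:, k] - U[:, :k] @ (U[:, :k].T @ X[:, k])
        U[:, k] = residual / np.linalg.norm(residual)
    return U

X = np.array([[3.0, 1.0],
              [0.0, 2.0]])
U = gram_schmidt(X)
print(U)        # orthonormal basis: [[1, 0], [0, 1]]
print(U.T @ U)  # ~ identity
```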
Lecture 5 Subspaces, Bases, and Projections
1. Recalling the geometry of least squares; introducing the span
The plane in the figure above is the space spanned by the column vectors of $X$.
span(cols(X)) = \{ v \in \mathbb{R}^n \mid v=w_1X_1 + w_2X_2+\cdots+ w_pX_p \quad \text{for some } w_1, w_2, \cdots, w_p \}
2. Subspace
If the cols of $X$ are linearly independent, then span(cols($X$)) is a subspace, and the columns of $X$ form a basis for it.
In the earlier figure, the (green) point $\hat {\underline y}$ is the projection of $\underline{y}$ onto $X$.
(Figures: a vertical plane and a horizontal plane)
A subspace always contains the origin, i.e., $\overrightarrow{0}$.
3. How do we represent a subspace?
By a set of vectors that spans it
By a set of linearly independent vectors that spans it; such a set is called a basis
By a set of orthogonal ...
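A small numpy check related to these representations (the vectors are illustrative, not from the lecture): a vector lies in the subspace spanned by the columns of X exactly when projecting it onto that span returns the vector unchanged.

```python
import numpy as np

X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])          # columns span a plane in R^3
v_in = X @ np.array([2.0, -1.0])    # a combination of the columns
v_out = np.array([1.0, 0.0, 0.0])   # not in the plane

P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection onto span(cols(X))
print(np.allclose(P @ v_in, v_in))     # True  -> in the subspace
print(np.allclose(P @ v_out, v_out))   # False -> not in the subspace
```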
6. Basic Syntax
1. Basic structure
Sequential statements
Branch statements
Loop statements
1. if, switch statements
```cpp
#include <iostream>
using namespace std;

bool isLeapYear(unsigned int year)
{
    if ((year % 4 == 0 && year % 100 != 0) || (year % 400 == 0)) {
        return true;
    } else {
        return false;
    }
}

typedef enum _COLOR {
    RED,
    GREEN,
    BLUE,
    UNKNOWN
} color;

int main()
{
    cout << isLeapYear(2000 ...
```