## Andrew Ng机器学习公开课笔记 -- Generalized Linear Models

Notes: http://cs229.stanford.edu/notes/cs229-notes1.pdf

### The exponential family

The exponential families include many of the most common distributions, including the normal, exponential, gamma, chi-squared, beta, Dirichlet, Bernoulli, categorical, Poisson, Wishart, inverse Wishart, and many others.

A distribution belongs to the exponential family if it can be written in the form

p(y; η) = b(y) exp(ηᵀT(y) − a(η))

- η is called the natural parameter (also called the canonical parameter) of the distribution.
- T(y) is the sufficient statistic (for the distributions we consider, it will often be the case that T(y) = y).
- a(η) is the log partition function. The quantity e^{−a(η)} essentially plays the role of a normalization constant, that makes sure the distribution p(y; η) sums/integrates over y to 1.

For the Bernoulli distribution, matching it to the exponential-family form gives η = log(φ/(1 − φ)), i.e. φ = 1/(1 + e^{−η}), which is the sigmoid function.

For the Gaussian, we can set σ² = 1, because this parameter has no effect on the regression result; any value would work, and 1 is simply more convenient.
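As a sanity check (a sketch of my own, not from the notes), we can verify the Bernoulli case numerically: with η = log(φ/(1 − φ)), b(y) = 1, T(y) = y, and a(η) = log(1 + e^η), the exponential-family form recovers φ and normalizes to 1:

```python
import math

# Bernoulli(phi) in exponential-family form:
#   p(y; eta) = b(y) * exp(eta * y - a(eta)),  with b(y) = 1,
#   eta = log(phi / (1 - phi)),  a(eta) = log(1 + e^eta)
phi = 0.3
eta = math.log(phi / (1 - phi))      # natural parameter
a = math.log(1 + math.exp(eta))      # log partition function

def p(y):
    return math.exp(eta * y - a)

print(p(1))          # recovers phi = 0.3
print(p(0) + p(1))   # sums to 1.0 over y in {0, 1}
```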

### Constructing GLMs

To derive a GLM for this problem, we will make the following three assumptions about the conditional distribution of y given x and about our model:

1. y | x; θ ∼ ExponentialFamily(η). I.e., given x and θ, the distribution of y follows some exponential family distribution, with parameter η.

2. Given x, our goal is to predict the expected value of T(y) given x.
In most of our examples, we will have T(y) = y, so this means we would like the prediction h(x) output by our learned hypothesis h to satisfy h(x) = E[y|x]. (Note that this assumption is satisfied in the choices for h(x) for both logistic regression and linear regression. For instance, in logistic regression, we had h(x) = p(y = 1|x; θ) = 0 · p(y = 0|x; θ) + 1 · p(y = 1|x; θ) = E[y|x; θ].)

3. The natural parameter η and the inputs x are related linearly: η = θᵀx.
(Or, if η is vector-valued, then η_i = θ_iᵀx.)

For ordinary least squares, where y | x follows a Gaussian:

h(x) = E[y|x; θ]    // by assumption 2
     = μ            // the expectation of a Gaussian is μ
     = η            // rewriting the Gaussian in exponential-family form gives μ = η
     = θᵀx          // by assumption 3
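The derivation above says the Gaussian GLM hypothesis is just h(x) = θᵀx, i.e. ordinary linear regression. A minimal sketch (the toy data and variable names are my own), fitting θ by the normal equations:

```python
import numpy as np

# Toy data generated from y = 2*x + 1 plus small Gaussian noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.1, size=50)

# Design matrix with an intercept column (x0 = 1)
X = np.column_stack([np.ones_like(x), x])

# Normal equations: theta = (X^T X)^{-1} X^T y
theta = np.linalg.solve(X.T @ X, X.T @ y)

def h(x_new):
    # h(x) = theta^T x  (mu = eta = theta^T x)
    return theta[0] + theta[1] * x_new

print(theta)  # close to [1.0, 2.0]
```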

### Logistic regression

h(x) = E[y|x; θ]         // by assumption 2
     = φ                 // the expectation of a Bernoulli is φ
     = 1/(1 + e^{−η})    // from the exponential-family form of the Bernoulli
     = 1/(1 + e^{−θᵀx})  // by assumption 3
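So the logistic-regression hypothesis is the sigmoid applied to θᵀx. A minimal sketch (theta and x values are made up for illustration):

```python
import math

def h(theta, x):
    """Logistic GLM hypothesis: h(x) = 1 / (1 + exp(-theta^T x))."""
    z = sum(t * xi for t, xi in zip(theta, x))   # theta^T x
    return 1.0 / (1.0 + math.exp(-z))

# theta^T x = 0 lands exactly on the decision boundary: probability 0.5
print(h([1.0, -1.0], [2.0, 2.0]))   # 0.5
```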

### Softmax regression

Logistic regression solves binary classification, but for multiclass problems we need softmax regression. For example, instead of classifying email only as spam vs. not-spam, we may want to classify it as spam, personal, or work.

φi = p(y = i; φ)

Define the indicator function 1{·}, where 1{True} = 1 and 1{False} = 0. For example, 1{2 = 3} = 0, and 1{3 = 5 − 2} = 1.

Here T(y) is a vector rather than a scalar, with (T(y))_i = 1{y = i}. This is easy to understand: component i is 1 only when y = i, and 0 otherwise.

Therefore E[(T(y))_i] = p(y = i) = φ_i, because when taking the expectation every other term is multiplied by 0, and only this term remains.

By the third GLM assumption, η_i = θ_iᵀx, so

p(y = i|x; θ) = φ_i = e^{η_i} / Σ_j e^{η_j} = e^{θ_iᵀx} / Σ_j e^{θ_jᵀx}

which is the softmax function.
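A minimal sketch of the softmax hypothesis (the θ_i and x values are made up, and T(y) is written as a full k-dimensional one-hot vector for simplicity):

```python
import math

def softmax_probs(thetas, x):
    """phi_i = exp(theta_i^T x) / sum_j exp(theta_j^T x)."""
    etas = [sum(t * xi for t, xi in zip(theta, x)) for theta in thetas]
    m = max(etas)                           # subtract max for numerical stability
    exps = [math.exp(e - m) for e in etas]
    s = sum(exps)
    return [e / s for e in exps]

def T(y, k):
    """Sufficient statistic: one-hot vector with (T(y))_i = 1{y = i}."""
    return [1 if y == i else 0 for i in range(k)]

# Three classes (e.g. spam, personal, work); parameters are illustrative only
thetas = [[0.5, -1.0], [0.2, 0.3], [-0.1, 0.8]]
x = [1.0, 2.0]
phis = softmax_probs(thetas, x)
print(phis, sum(phis))   # probabilities sum to 1 (up to floating point)
```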
