Common Distances in Machine Learning

2017-04-02



If $x_{1}, x_{2}\in\mathbb{R}^{n}$, then:
Minkowski Distance

$$d_{12}=\sqrt[p]{\sum_{k=1}^{n}\left|x_{1k}-x_{2k}\right|^{p}},\quad p>0$$
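
The Minkowski formula can be sketched in NumPy as follows. The function name `minkowski_distance` and the sample vectors are illustrative assumptions, not part of the original article. The sketch takes the absolute value of each component difference so that odd values of p behave sensibly.

```python
import numpy as np

def minkowski_distance(x1, x2, p):
    """Minkowski distance of order p between two equal-length vectors."""
    if p <= 0:
        raise ValueError("p must be positive")
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # Sum the p-th powers of the absolute differences, then take the p-th root
    return float(np.sum(np.abs(x1 - x2) ** p) ** (1.0 / p))

vector1 = [1, 2, 3]
vector2 = [4, 5, 7]
print(minkowski_distance(vector1, vector2, 1))  # p = 1 gives the Manhattan distance
print(minkowski_distance(vector1, vector2, 2))  # p = 2 gives the Euclidean distance
```

Setting p = 1 or p = 2 recovers the Manhattan and Euclidean distances, and the value approaches the Chebyshev distance as p grows.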

Euclidean Distance
L2 norm

$$d_{12}=\sqrt{\sum_{k=1}^{n}(x_{1k}-x_{2k})^{2}}\quad\text{or}\quad d_{12}=\sqrt{(x_{1}-x_{2})^{T}(x_{1}-x_{2})}$$
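
As a quick check that the component-wise and inner-product forms agree, here is a minimal NumPy sketch. The sample vectors are illustrative assumptions.

```python
import numpy as np

vector1 = np.array([1.0, 2.0, 3.0])
vector2 = np.array([4.0, 5.0, 7.0])
diff = vector1 - vector2

# Component-wise form: square root of the sum of squared differences
d_sum = np.sqrt(np.sum(diff ** 2))
# Inner-product form: sqrt((x1 - x2)^T (x1 - x2))
d_dot = np.sqrt(diff @ diff)
# Library form: the L2 norm of the difference vector
d_norm = np.linalg.norm(diff)

print(d_sum, d_dot, d_norm)
```

All three expressions compute the same quantity; `np.linalg.norm` is usually the most convenient in practice.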

Standardized Euclidean Distance / Weighted Euclidean Distance

$$d_{12}=\sqrt{\sum_{k=1}^{n}\left(\frac{x_{1k}-x_{2k}}{S_{k}}\right)^{2}}$$

where $S_{k}$ is the standard deviation of the $k$-th component.

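import numpy as np

# Rows are samples, columns are components
vectors = np.array([[1, 2, 3], [4, 5, 6]], dtype=float)
# S_k: standard deviation of each component, computed across the samples
s = vectors.std(axis=0)
# Scale each component difference by S_k before taking the Euclidean norm
normv12 = (vectors[0] - vectors[1]) / s
print(np.sqrt(normv12 @ normv12))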

Manhattan Distance
L1 norm

$$d_{12}=\sum_{k=1}^{n}\left|x_{1k}-x_{2k}\right|$$
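
A minimal NumPy sketch of the Manhattan distance, using illustrative sample vectors that are an assumption of this example:

```python
import numpy as np

vector1 = np.array([1, 2, 3])
vector2 = np.array([4, 5, 7])
# Sum of the absolute component differences (the L1 norm of the difference)
manhattan = np.sum(np.abs(vector1 - vector2))
print(manhattan)
```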

Chebyshev Distance
L∞ norm

$$d_{12}=\max_{i}\left(\left|x_{1i}-x_{2i}\right|\right)$$

import numpy as np

vector1 = np.array([1, 2, 3])
vector2 = np.array([4, 5, 7])
# Largest absolute component difference
print(np.abs(vector1 - vector2).max())

Cosine Similarity

$$\cos\theta=\frac{\sum_{k=1}^{n}x_{1k}x_{2k}}{\sqrt{\sum_{k=1}^{n}x_{1k}^{2}}\,\sqrt{\sum_{k=1}^{n}x_{2k}^{2}}}$$
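
The cosine formula is the dot product of the two vectors divided by the product of their L2 norms. A minimal NumPy sketch, with illustrative sample vectors (an assumption of this example), might look like this:

```python
import numpy as np

vector1 = np.array([1.0, 2.0, 3.0])
vector2 = np.array([4.0, 5.0, 7.0])
# Dot product divided by the product of the two L2 norms
cos_theta = (vector1 @ vector2) / (np.linalg.norm(vector1) * np.linalg.norm(vector2))
print(cos_theta)
```

The value is undefined when either vector is all zeros, so real code would need to guard against a zero norm.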

Hamming Distance
In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of substitutions required to change one string into the other. (Source: Wikipedia)

import numpy as np

vectors = np.array([[1, 1, 0, 1, 0, 1, 0, 0, 1], [0, 1, 1, 0, 0, 0, 1, 1, 1]])
# Count the positions at which the two vectors differ
print(np.count_nonzero(vectors[0] != vectors[1]))

Jaccard Similarity Coefficient
Given two sets, A and B, the Jaccard similarity coefficient is defined as

$$J(A,B)=\frac{|A\cap B|}{|A\cup B|}$$
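
For plain Python sets the coefficient can be computed directly from the definition. The function name `jaccard_similarity` and the two sample sets are illustrative assumptions. Raising an error for two empty sets is one possible convention, not something the article specifies.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: treat the coefficient of two empty sets as undefined
        raise ValueError("Jaccard similarity is undefined for two empty sets")
    return len(a & b) / len(union)

A = {0, 1, 3, 5, 8}
B = {1, 2, 6, 7, 8}
print(jaccard_similarity(A, B))
```

Here the intersection has 2 elements and the union has 8.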

Jaccard Distance

$$J_{\delta}(A,B)=1-J(A,B)=\frac{|A\cup B|-|A\cap B|}{|A\cup B|}$$

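import numpy as np
import scipy.spatial.distance as dist

# Two binary vectors, one per row
vectors = np.array([[1, 1, 0, 1, 0, 1, 0, 0, 1], [0, 1, 1, 0, 0, 0, 1, 1, 1]], dtype=bool)
# pdist returns the pairwise Jaccard distances between the rows
print(dist.pdist(vectors, 'jaccard'))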

Mahalanobis Distance
Given $m$ sample vectors $X_{1},\dots,X_{m}$ with mean $\mu$ and covariance matrix $S$, the Mahalanobis distance between a sample vector $X$ and $\mu$ is defined as

$$D(X)=\sqrt{(X-\mu)^{T}S^{-1}(X-\mu)}$$

and the Mahalanobis distance between sample vectors $X_{i}$ and $X_{j}$ is

$$D(X_{i},X_{j})=\sqrt{(X_{i}-X_{j})^{T}S^{-1}(X_{i}-X_{j})}$$
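
Both Mahalanobis formulas can be sketched with NumPy. The sample matrix `X` and the helper name `mahalanobis` are hypothetical and chosen only for illustration. The sketch assumes the covariance matrix is invertible and uses the sample covariance (`np.cov` defaults to `ddof=1`).

```python
import numpy as np

# Hypothetical data: rows are samples, columns are components
X = np.array([[1.0, 2.0], [1.0, 3.0], [2.0, 2.0], [3.0, 1.0]])
mu = X.mean(axis=0)
S = np.cov(X, rowvar=False)      # covariance matrix of the components
S_inv = np.linalg.inv(S)         # assumes S is non-singular

def mahalanobis(u, v, S_inv):
    """sqrt((u - v)^T S^{-1} (u - v)) for 1-D arrays u and v."""
    d = u - v
    return float(np.sqrt(d @ S_inv @ d))

print(mahalanobis(X[0], mu, S_inv))    # distance from a sample to the mean
print(mahalanobis(X[0], X[1], S_inv))  # distance between two samples
```

When the covariance matrix is the identity, the result reduces to the Euclidean distance.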
