糖逗 posted on 2020-11-19 16:42:39

Implementing PCA in Python

Reference book: Machine Learning in Action (《机器学习实战》)


import numpy as np

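def loadDataSet(fileName, delim = '\t'):
    # Read a delimited text file: one sample per line, one feature per column
    with open(fileName) as fr:
        stringArr = [line.strip().split(delim) for line in fr.readlines()]
    # Convert every field to float (list(...) is needed because map is lazy in Python 3)
    datArr = [list(map(float, line)) for line in stringArr]
    return np.mat(datArr)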

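def pca(dataMat, topNfeat = 9999999):
    meanVals = np.mean(dataMat, axis = 0)
    meanRemoved = dataMat - meanVals          # center every feature at zero
    covMat = np.cov(meanRemoved, rowvar = 0)  # rowvar=0: each row passed in is one sample
    eigVals, eigVects = np.linalg.eig(np.mat(covMat))
    eigValInd = np.argsort(eigVals)           # indices sorted by ascending eigenvalue
    eigValInd = eigValInd[:-(topNfeat+1):-1]  # keep the topNfeat largest, in descending order
    redEigVects = eigVects[:,eigValInd]       # columns are the retained principal directions
    # coordinates of all points in the new basis (scalar values only, no direction)
    lowDDataMat = meanRemoved * redEigVects
    # projections of all points onto the new basis, as vectors in the original space
    reconMat = (lowDDataMat * redEigVects.T) + meanVals
    return lowDDataMat, reconMat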


if __name__ == '__main__':
    dataMat = loadDataSet(r'C:\...\testSet.txt')
    lowData, reconMat = pca(dataMat, 1)

    import matplotlib.pyplot as plt
    fig = plt.figure()
    ax = fig.add_subplot(111)
    # original samples (triangles); .A[0] turns the 1 x n matrix into a 1-D array
    ax.scatter(dataMat[:, 0].flatten().A[0], dataMat[:, 1].flatten().A[0],
               marker = '^', s = 90)
    # samples reconstructed from the first principal component (red circles)
    ax.scatter(reconMat[:, 0].flatten().A[0], reconMat[:, 1].flatten().A[0],
               marker = 'o', s = 50, c = 'red')
    plt.show()
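
The path to testSet.txt above is elided, so here is a minimal sketch for trying the same steps without that file. It is not the book's code: it assumes synthetic, strongly correlated 2-D data, uses plain ndarrays instead of np.mat, and swaps np.linalg.eig for np.linalg.eigh (which is intended for symmetric matrices such as a covariance matrix). The names pca_eigh, top_n_feat and explained are made up for illustration.

```python
import numpy as np

def pca_eigh(data, top_n_feat=1):
    """PCA via eigendecomposition of the covariance matrix (plain ndarrays)."""
    mean_vals = data.mean(axis=0)
    mean_removed = data - mean_vals
    cov_mat = np.cov(mean_removed, rowvar=False)      # each row is one sample
    # eigh handles symmetric matrices and always returns real eigenvalues
    eig_vals, eig_vects = np.linalg.eigh(cov_mat)
    order = np.argsort(eig_vals)[::-1][:top_n_feat]   # largest eigenvalues first
    red_eig_vects = eig_vects[:, order]
    low_d = mean_removed @ red_eig_vects              # coordinates in the new basis
    recon = low_d @ red_eig_vects.T + mean_vals       # back in the original space
    explained = eig_vals[order].sum() / eig_vals.sum()  # fraction of variance kept
    return low_d, recon, explained

# Synthetic data (an assumption, not the book's testSet.txt): y is roughly 2 * x
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([x, 2.0 * x + rng.normal(scale=0.1, size=200)])

low_d, recon, explained = pca_eigh(data, top_n_feat=1)
print(low_d.shape, recon.shape)   # (200, 1) (200, 2)
print(f"fraction of variance kept by 1 component: {explained:.4f}")
```

Because the two features are almost perfectly correlated, a single component should keep nearly all of the variance. Keeping both components (top_n_feat=2) reconstructs the data exactly, which is a quick way to check that the projection and reconstruction steps are consistent.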
   

糖逗 posted on 2020-11-19 16:43:17
