Boston housing data model question (FishC Forum, Python discussion board)

余小c真的很强 posted on 2020-6-3 18:55:25

Boston housing data model question

This is a machine learning exercise I found online:
# Load the Boston housing data
from sklearn.datasets import load_boston
boston = load_boston()
print(boston.DESCR)
This is the output:

Boston House Prices dataset
===========================

Notes
------
Data Set Characteristics:

:Number of Instances: 506

:Number of Attributes: 13 numeric/categorical predictive

:Median Value (attribute 14) is usually the target

:Attribute Information (in order):
   - CRIM per capita crime rate by town
   - ZN    proportion of residential land zoned for lots over 25,000 sq.ft.
   - INDUS proportion of non-retail business acres per town
   - CHAS Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
   - NOX    nitric oxides concentration (parts per 10 million)
   - RM    average number of rooms per dwelling
   - AGE    proportion of owner-occupied units built prior to 1940
   - DIS    weighted distances to five Boston employment centres
   - RAD    index of accessibility to radial highways
   - TAX    full-value property-tax rate per $10,000
   - PTRATIO pupil-teacher ratio by town
   - B    1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
   - LSTAT % lower status of the population
   - MEDV Median value of owner-occupied homes in $1000's

:Missing Attribute Values: None

:Creator: Harrison, D. and Rubinfeld, D.L.

This is a copy of UCI ML housing dataset.
http://archive.ics.uci.edu/ml/datasets/Housing

This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University.

The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic
prices and the demand for clean air', J. Environ. Economics & Management,
vol.5, 81-102, 1978. Used in Belsley, Kuh & Welsch, 'Regression diagnostics
...', Wiley, 1980. N.B. Various transformations are used in the table on
pages 244-261 of the latter.

The Boston house-price data has been used in many machine learning papers that address regression
problems.

**References**

- Belsley, Kuh & Welsch, 'Regression diagnostics: Identifying Influential Data and Sources of Collinearity', Wiley, 1980. 244-261.
- Quinlan,R. (1993). Combining Instance-Based and Model-Based Learning. In Proceedings on the Tenth International Conference of Machine Learning, 236-243, University of Massachusetts, Amherst. Morgan Kaufmann.
- many more! (see http://archive.ics.uci.edu/ml/datasets/Housing)

# Load the Boston housing data into a pandas DataFrame
import pandas as pd
df = pd.DataFrame(boston.data,columns=boston.feature_naMSE)
df['MEDV'] = boston.target  # load the target variable
x = df.RM.to_frame()
y = df.MEDV

Error message:
KeyError                               Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\__init__.py in __getattr__(self, key)
60       try:
---> 61          return self[key]
62       except KeyError:

KeyError: 'feature_naMSE'

During handling of the above exception, another exception occurred:

AttributeError                         Traceback (most recent call last)
<ipython-input-11-1277f379165e> in <module>()
   1 import pandas as pd
----> 2 df = pd.DataFrame(boston.data,columns=boston.feature_naMSE)
   3 df['MEDV'] = boston.target  # load the target variable
   4 x = df.RM.to_frame()
   5 y = df.MEDV

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\__init__.py in __getattr__(self, key)
61          return self[key]
62       except KeyError:
---> 63          raise AttributeError(key)
64
65 def __setstate__(self, state):

AttributeError: feature_naMSE
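I'm a beginner, so any pointers from the experts would be appreciated. Also, are there any good ways to learn machine learning?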
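
The traceback above shows two chained exceptions because the object returned by load_boston is a dict whose keys can also be read as attributes: a failed key lookup (KeyError) is caught and re-raised as an AttributeError. Below is a minimal sketch of that mechanism. MiniBunch and its toy contents are hypothetical stand-ins written for illustration, not sklearn's actual class or data.

```python
class MiniBunch(dict):
    """Hypothetical stand-in for a dict whose keys also work as attributes."""

    def __getattr__(self, key):
        # Called only when normal attribute lookup fails.
        try:
            return self[key]
        except KeyError:
            # Raising inside the except block chains the two exceptions,
            # so the traceback shows KeyError first, then AttributeError.
            raise AttributeError(key)


# Placeholder contents; the real dataset holds arrays, not these toy values.
boston_like = MiniBunch(feature_names=["CRIM", "ZN", "RM"], target=[24.0, 21.6])

print(boston_like.feature_names)  # a valid key resolves through __getattr__

caught = None
try:
    boston_like.feature_naMSE  # the misspelled attribute from the post
except AttributeError as err:
    caught = err
    print("AttributeError:", err)
    print("raised while handling:", repr(err.__context__))
```

In other words, an AttributeError on this kind of object usually means the name is not one of its keys, so the spelling of the attribute is the first thing to check.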

余小c真的很强 posted on 2020-6-4 17:27:50

After a day of self-study, I finally found the problem:
import pandas as pd
df = pd.DataFrame(boston.data,columns=boston.feature_naMSE)
df['MEDV'] = boston.target  # load the target variable
x = df.RM.to_frame()
y = df.MEDV
The feature_naMSE here should be feature_names, which is one of the keys of boston.
Self-study really does pay off.
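
One way to track down a typo like this is to list the keys the object actually has and compare them with the name used in the code. The sketch below uses only the standard library. The key list is an assumption for illustration (on a real object it would come from list(boston.keys())), and using difflib here is just one convenient option, not something the original post did.

```python
import difflib

# Assumed key list for illustration; on a real object, use list(boston.keys()).
available_keys = ["data", "target", "feature_names", "DESCR"]

misspelled = "feature_naMSE"

# get_close_matches ranks the candidates by similarity to the misspelled name.
suggestions = difflib.get_close_matches(misspelled, available_keys, n=1)

print("available keys:", available_keys)
print("did you mean:", suggestions)
```

With the corrected name, the DataFrame line becomes columns=boston.feature_names.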

java2python posted on 2020-6-4 17:32:18

Impressive stuff.
