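What does this step of the code mean?
In:
# "the index" (aka "the labels")
users.index
Out:
Int64Index([1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
...
934, 935, 936, 937, 938, 939, 940, 941, 942, 943],
dtype='int64', name='user_id', length=943)

It prints the index attribute of the users object. It works like a list and has 943 elements.

suchocolate wrote on 2021-1-24 20:35:
"It prints the index attribute of the users object. It works like a list and has 943 elements."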
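Why does it print Int64Index([......])? I looked up Int64Index on Baidu, and it means a 64-bit integer index. Your explanation is a bit too technical for me at the moment. Sorry, I'm a little slow.
Sorry about that, and thank you for all the help today. You solved every thread I posted today.

Could someone else reply as well?

1476372787 wrote on 2021-1-25 09:32:
"Could someone else reply as well?"

Post the full code.

永恒的蓝色梦想 wrote on 2021-1-25 09:47:
"Post the full code."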
Ex3 - Getting and Knowing your Data
Check out Occupation Exercises Video Tutorial to watch a data scientist go through the exercises
This time we are going to pull data directly from the internet. Special thanks to: https://github.com/justmarkham for sharing the dataset and materials.
Step 1. Import the necessary libraries
In :
import pandas as pd
Step 2. Import the dataset from this address.
Step 3. Assign it to a variable called users and use the 'user_id' as index
In :
users = pd.read_csv('https://raw.githubusercontent.com/justmarkham/DAT8/master/data/u.user',
sep='|', index_col='user_id')
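To try Step 3 without a network connection, here is a minimal sketch that reads a few rows from an in-memory string instead of the URL. The sample text and the name users_sample are assumptions made for illustration; the row values are copied from the outputs shown in this thread.

```python
import io
import pandas as pd

# A few rows in the same pipe-separated layout (sample text is an assumption
# made for illustration; values are copied from the outputs in this thread).
sample = """user_id|age|gender|occupation|zip_code
1|24|M|technician|85711
2|53|F|other|94043
3|23|M|writer|32067
"""

# sep='|' tells read_csv that fields are separated by pipes;
# index_col='user_id' turns that column into the row labels (the index).
users_sample = pd.read_csv(io.StringIO(sample), sep='|', index_col='user_id')

print(users_sample)
print(users_sample.index.name)   # user_id
```

Because user_id becomes the index, it no longer appears among the ordinary columns.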
Step 4. See the first 25 entries
In :
users.head(25)
Out:
         age gender     occupation zip_code
user_id
1         24      M     technician    85711
2         53      F          other    94043
3         23      M         writer    32067
4         24      M     technician    43537
5         33      F          other    15213
6         42      M      executive    98101
7         57      M  administrator    91344
8         36      M  administrator    05201
9         29      M        student    01002
10        53      M         lawyer    90703
11        39      F          other    30329
12        28      F          other    06405
13        47      M       educator    29206
14        45      M      scientist    55106
15        49      F       educator    97301
16        21      M  entertainment    10309
17        30      M     programmer    06355
18        35      F          other    37212
19        40      M      librarian    02138
20        42      F      homemaker    95660
21        26      M         writer    30068
22        25      M         writer    40206
23        30      F         artist    48197
24        21      F         artist    94533
25        39      M       engineer    55107
Step 5. See the last 10 entries
In :
users.tail(10)
Out:
         age gender     occupation zip_code
user_id
934       61      M       engineer    22902
935       42      M         doctor    66221
936       24      M          other    32789
937       48      M       educator    98072
938       38      F     technician    55038
939       26      F        student    33319
940       32      M  administrator    02215
941       20      M        student    97229
942       48      F      librarian    78209
943       22      M        student    77841
Step 6. What is the number of observations in the dataset?
In :
users.shape[0]
Out:
943
Step 7. What is the number of columns in the dataset?
In :
users.shape[1]
Out:
4
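For Steps 6 and 7, it may help to see that shape is a (rows, columns) tuple, so each count is one element of it. A minimal sketch, assuming a small hand-made frame named df that stands in for users (the data is illustrative only):

```python
import pandas as pd

# Small hand-made frame standing in for `users` (illustrative data only).
df = pd.DataFrame(
    {"age": [24, 53, 23], "gender": ["M", "F", "M"]},
    index=pd.Index([1, 2, 3], name="user_id"),
)

n_rows, n_cols = df.shape   # shape is a (rows, columns) tuple
print(df.shape)             # (3, 2)
print(df.shape[0])          # 3, the number of observations (rows)
print(df.shape[1])          # 2, the number of columns
print(len(df))              # 3, len() also counts rows
```

The index (user_id here) is not counted as a column.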
Step 8. Print the name of all the columns.
In :
users.columns
Out:
Index(['age', 'gender', 'occupation', 'zip_code'], dtype='object')
Step 9. How is the dataset indexed?
In :
# "the index" (aka "the labels")
users.index
Out:
Int64Index([1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
...
934, 935, 936, 937, 938, 939, 940, 941, 942, 943],
dtype='int64', name='user_id', length=943)
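The output starts with a class name because users.index is a pandas Index object rather than a plain Python list. A minimal sketch with a hand-made frame (the data and the name df are illustrative assumptions) shows what it shares with a list and what it adds. Note that the Int64Index class was removed in pandas 2.0, so newer versions print Index([...], dtype='int64', ...) for the same data.

```python
import pandas as pd

# Hand-made frame standing in for `users` (illustrative data only).
df = pd.DataFrame(
    {"occupation": ["technician", "other", "writer"]},
    index=pd.Index([1, 2, 3], name="user_id"),
)

idx = df.index
print(isinstance(idx, pd.Index))   # True: an Index object ...
print(isinstance(idx, list))       # False: ... not a plain list
print(len(idx))                    # 3
print(idx.name)                    # user_id
print(idx.tolist())                # [1, 2, 3], converted to a real list

# The index holds labels, so .loc looks rows up by user_id, not by position.
print(df.loc[2, "occupation"])     # other
```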
Step 10. What is the data type of each column?
In :
users.dtypes
Out:
age int64
gender object
occupation object
zip_code object
dtype: object
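The zip_code column is reported as object (text) rather than int64, which is why values such as 05201 keep their leading zero in the output above. As a rough sketch of why the dtype matters, the made-up two-row sample below (values borrowed from rows 8 and 9) shows what happens when read_csv infers integers instead, and how passing dtype for that column keeps the text.

```python
import io
import pandas as pd

# Illustrative sample (not the real file): every zip code here looks numeric.
sample = """user_id|age|zip_code
8|36|05201
9|29|01002
"""

# Default inference: zip_code becomes int64 and the leading zeros are lost.
inferred = pd.read_csv(io.StringIO(sample), sep='|', index_col='user_id')
print(inferred.dtypes)
print(inferred.loc[8, 'zip_code'])   # 5201

# Forcing str keeps the text exactly as written in the file.
as_text = pd.read_csv(io.StringIO(sample), sep='|', index_col='user_id',
                      dtype={'zip_code': str})
print(as_text.loc[8, 'zip_code'])    # 05201
```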
Step 11. Print only the occupation column
In :
users.occupation
# or
users['occupation']
Out:
user_id
1 technician
2 other
3 writer
4 technician
5 other
6 executive
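... (too long to fit here; the line below is the last one displayed)
943 student
Name: occupation, Length: 943, dtype: object

永恒的蓝色梦想 wrote on 2021-1-25 09:47:
"Post the full code."

My reply is awaiting moderation.

This is a homework assignment. It is a bit long and the formatting is a little odd, so please bear with it.
It covers everything up to Step 11. There are more steps after that, but this is as far as I have got.
Thank you very much.

1476372787 wrote on 2021-1-24 21:30:
"Why does it print Int64Index([......])? I looked up Int64Index on Baidu, and it means a 64-bit integer index. Your explanation ..."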
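That is because the class that users.index belongs to customizes what the shell displays for its objects. I don't have your code, so here is a simple demonstration:
>>> class A():  # define the class
...     def __init__(self, ls, dt):
...         self.ls = ls
...         self.dt = dt
...     def __repr__(self):  # defines what the shell shows when the object is evaluated directly
...         return f"DataType{self.dt}({self.ls})"
...
>>> ls1 = [1, 2, 3]  # illustrative values (the list in the original post was lost)
>>> a = A(ls1, 'Int64')
>>> a
DataTypeInt64([1, 2, 3])
>>>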
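The same idea can be written as a script that goes one step further: an object can support len() and indexing like a list and still print with its own class name. This is only a sketch; MiniIndex is a hypothetical toy class, not how pandas implements its index.

```python
class MiniIndex:
    """Toy stand-in for a pandas index (hypothetical, not pandas code)."""

    def __init__(self, values, name=None):
        self.values = list(values)
        self.name = name

    def __len__(self):
        # Lets len(obj) work, like a list.
        return len(self.values)

    def __getitem__(self, position):
        # Lets obj[0] work, like a list.
        return self.values[position]

    def __repr__(self):
        # The shell calls repr() to decide what to display for an object.
        return f"MiniIndex({self.values}, name={self.name!r}, length={len(self)})"


idx = MiniIndex([1, 2, 3], name="user_id")
print(repr(idx))   # MiniIndex([1, 2, 3], name='user_id', length=3)
print(len(idx))    # 3
print(idx[0])      # 1
```

Typing idx on its own in the shell would show the same text as print(repr(idx)), which is the mechanism behind the Int64Index([...]) display.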
永恒的蓝色梦想 wrote on 2021-1-25 09:47:
"Post the full code."

The code has been posted now. It should be just about readable.