鱼C论坛 (FishC Forum)

Views: 655 | Replies: 3

Help with a problem

Posted on 2023-10-19 02:47:10

# Q3. Coronavirus cases

---

## Learning objectives

* Perform simple analysis on real data via programming using what we have learned from week 3, including:
  * Extracting information from string using indexing and slicing
  * Working on `list`, including modifying it
* Realise that while we can use built-in Python containers like `list` to analyse data, it is not as convenient as one would hope

---

## Background information

In this question, we analyse the coronavirus data from [WHO](covid19.who.int). The data has been cleaned and saved in [data/coronavirus_cases.csv](data/coronavirus_cases.csv). The CSV file contains the daily coronavirus cases for the G7 countries (Canada, France, Germany, Italy, Japan, the United Kingdom and the United States) from 3 January 2020 to 30 September 2021.

---

## Instructions

1. Have a look at the code provided in the [`src/coronavirus_cases.py`](src/coronavirus_cases.py) file. The code does the following:

  * Load in data from [data/coronavirus_cases.csv](data/coronavirus_cases.csv) using `open()`
  * For each line of the CSV file:
    * Split the line into a list by separating values in the line by commas (via `.split(',')`), with the first value representing a day, and the next 7 values representing the number of cases in that day for the G7 countries
    * Cast the number of cases from `str` to `int` using `int()`
    * Add the list created into `cases` using `append()`

    **Note**
    * You are not required to fully understand the given code, but you should have some idea what it does
    * You do NOT need to (and should not) write code to load the data by yourself. It has already been done for you

2. Run the whole [`src/coronavirus_cases.py`](src/coronavirus_cases.py) file in `Spyder` and understand the object that the variable `cases` binds to:

  * `cases` is a `list` of `list` and can be considered as a table with `637` rows and `8` columns and looks like the following:
    ```
    [['2020-01-03',0,0,0,0,0,0,0],
    ...
    ['2021-01-01',7476,13186,22924,23477,4091,70797,237337],
    ['2021-01-02',8420,2711,12690,22210,3617,52783,230749],
    ...
    ['2021-09-29',4279,5859,11780,2962,1570,34520,94879],
    ['2021-09-30',3491,5479,12150,3212,2005,35059,107399]]
    ```
    * For each "row" (or a `list` inside `cases`):
      * The first element is a `str` storing the day information
      * The next 7 elements are `int` storing the number of new cases for each country (Canada, France, Germany, Italy, Japan, the United Kingdom, the United States) on the corresponding day

    If you are not sure how to run the whole file to create `cases`, please watch the video on Moodle under "Coursework" -> "Hints and visualisation", which demonstrates how to do it.

3. Create the required variables below by writing your code in [`src/coronavirus_cases.py`](src/coronavirus_cases.py):

  * a. Calculate the total number of cases for all 7 countries for each day and bind it to the variable `g7_cases`, which is a `list` of `int` with length `637`. `g7_cases` should look something like:

  ```
  [0, 0, ..., 155849, 168795]
  ```

  * b. Calculate the total number of cases for each country in the first 6 months of 2021 and bind it to the variable `cases_2021_first_half`, which is a `list` of `list`. For each row, the first element is a country name (`str`) and the second element is the total number of cases in the first half of 2021 (`int`) for that country. Please use the country names 'Canada', 'France', 'Germany', 'Italy', 'Japan', 'United Kingdom', 'United States'. `cases_2021_first_half` should look like:

  ```
  [['Canada', xxx],
  ['France', yyy],
  ...
  ['United States', zzz]]
  ```

  where `xxx`, `yyy`, `zzz` are the actual number of cases.

  * c. Using the variable `cases` from part (2), extract the column for Japan. Store it in the variable `japan_cases`, which is a `list` of `int` with length `637`. `japan_cases` should look like this:

  ```
  [0, ..., ..., 1570, 2005]
  ```
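
The loading steps described in step 1 could look roughly like the sketch below. This is not the provided `src/coronavirus_cases.py`: it reads from an in-memory sample (`io.StringIO`) standing in for `open()` on the real file, and it assumes the file has no header row, which the question does not state. The two sample rows are copied from the table shown in step 2.

```python
import io

# Stand-in for the real CSV file; rows copied from the table shown in step 2.
# Assumption: the file has no header line.
sample_csv = io.StringIO(
    "2021-01-01,7476,13186,22924,23477,4091,70797,237337\n"
    "2021-09-30,3491,5479,12150,3212,2005,35059,107399\n"
)

cases = []
for line in sample_csv:
    values = line.strip().split(',')    # ['2021-01-01', '7476', ...]
    row = [values[0]]                   # keep the day as a str
    for value in values[1:]:
        row.append(int(value))          # cast each count from str to int
    cases.append(row)

# Indexing and slicing on one row
first_day = cases[0][0]        # '2021-01-01'
year = first_day[:4]           # '2021'
month = int(first_day[5:7])    # 1
japan = cases[0][5]            # 4091 (date, Canada, France, Germany, Italy, Japan, ...)
```

Slicing the date string this way is one option for deciding whether a row falls in the first half of 2021.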

---

## Requirements

* Please make good use of comments to make it clear which block of code is for which part of the question. Alternatively, you can use functions to structure your code
* Make sure you get the results by the Python code, not eye-balling or from other tools like Excel
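
As a sketch of the function-based structure mentioned above: the function names `daily_totals` and `column` are hypothetical and not part of the provided file, and the two-row `cases` list is a stand-in built from the last two rows shown in the Instructions.

```python
def daily_totals(table):
    """Return the sum of the 7 country counts for each row."""
    return [sum(row[1:]) for row in table]


def column(table, index):
    """Return one column of the table as a list."""
    return [row[index] for row in table]


# Stand-in data: the last two rows shown in the Instructions.
cases = [
    ['2021-09-29', 4279, 5859, 11780, 2962, 1570, 34520, 94879],
    ['2021-09-30', 3491, 5479, 12150, 3212, 2005, 35059, 107399],
]

# Part (a)
g7_cases = daily_totals(cases)    # [155849, 168795]

# Part (c): Japan is the 5th country, so index 5 after the date
japan_cases = column(cases, 5)    # [1570, 2005]
```

Both results agree with the tails of the expected outputs given in parts (a) and (c).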

---

## Note

* The autograder checks your code by checking the value of the variables `g7_cases`, `cases_2021_first_half` and `japan_cases`
* In weeks 7 and 8, we will introduce `NumPy` and `Pandas`, which make the above computation very simple

Want to know what 小甲鱼 has been up to lately? Visit -> ilovefishc.com

Posted on 2023-10-19 02:47:31
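a. Calculate the total number of cases across the 7 countries for each day and bind it to the variable g7_cases. g7_cases is a list containing 637 int elements. It should look like this:
[0, 0, ..., 155849, 168795]

Solution:
g7_cases = []
for row in cases:
    total_cases = sum(row[1:])  # total cases across the 7 countries for this day
    g7_cases.append(total_cases)  # add the result to g7_cases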


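b. Calculate the total number of cases for each country in the first half of 2021 and bind it to the variable cases_2021_first_half. cases_2021_first_half is a list in which each element is a list holding a country name and that country's total for the first half of the year. Use the country names 'Canada', 'France', 'Germany', 'Italy', 'Japan', 'United Kingdom', 'United States'. cases_2021_first_half should look like this:
[['Canada', xxx],
['France', yyy],
...
['United States', zzz]]

where xxx, yyy and zzz are the actual numbers of cases.

Solution:
countries = ['Canada', 'France', 'Germany', 'Italy', 'Japan', 'United Kingdom', 'United States']
cases_2021_first_half = []
for i in range(len(countries)):
    total_cases = 0
    for row in cases:
        date = row[0]  # get the date
        if date.startswith('2021') and int(date.split('-')[1]) <= 6:  # is this row in the first half of 2021?
            total_cases += row[i + 1]  # column i + 1 holds this country's count
    cases_2021_first_half.append([countries[i], total_cases])  # add [name, total] to the result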


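c. Using the data in the variable cases, extract the column of cases for Japan and store it in the variable japan_cases. japan_cases is a list containing 637 int elements. It should look like this:
[0, ..., ..., 1570, 2005]

Solution:
japan_cases = [row[5] for row in cases]  # extract the Japan column (index 5)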

Posted on 2023-10-19 02:47:58
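Answer

According to the question, we need to complete the following tasks:

a. Calculate the total number of cases across the 7 countries for each day and store the result in the variable g7_cases.
b. Calculate the total number of cases for each country in the first half of 2021 and store the result in the variable cases_2021_first_half.
c. Extract the case data for Japan and store the result in the variable japan_cases.

Here is my solution:
# Task a
g7_cases = []
for row in cases:
    total_cases = sum(row[1:])  # total cases for this row (one day)
    g7_cases.append(total_cases)  # add the result to the g7_cases list

# Task b
countries = ['Canada', 'France', 'Germany', 'Italy', 'Japan', 'United Kingdom', 'United States']
cases_2021_first_half = []
for i in range(len(countries)):
    total_cases = 0
    for row in cases:
        date = row[0]  # get the date string
        if date.startswith('2021') and int(date.split('-')[1]) <= 6:  # is this row in the first half of 2021?
            total_cases += row[i + 1]  # column i + 1 holds this country's count
    cases_2021_first_half.append([countries[i], total_cases])  # add [name, total] to the list

# Task c
japan_cases = [row[5] for row in cases]  # extract the Japan column (index 5)

After running the code above, g7_cases holds the daily total across the 7 countries, cases_2021_first_half holds each country's total for the first half of 2021, and japan_cases holds the daily cases for Japan.

Hope this helps! If you have any questions, feel free to ask.

Please mark this as the best answer, thanks! It matters a lot to me!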

Posted on 2023-10-19 02:48:05
This question is asking you to perform some data analysis on a CSV file containing daily coronavirus cases for G7 countries from January 3, 2020, to September 30, 2021. The CSV file is already loaded into a list of lists called cases by the provided code. Each inner list represents a day and contains the date and the number of new cases for each of the G7 countries on that day.

Here is a Python code snippet that should help you complete this task:
# Part a
g7_cases = [sum(day[1:]) for day in cases]

# Part b
countries = ['Canada', 'France', 'Germany', 'Italy', 'Japan', 'United Kingdom', 'United States']
cases_2021_first_half = []
for i in range(7):
    total_cases = sum(day[i+1] for day in cases if '2021-01-01' <= day[0] <= '2021-06-30')
    cases_2021_first_half.append([countries[i], total_cases])

# Part c
japan_cases = [day[5] for day in cases]

In part a, we're using a list comprehension to sum the number of cases for all countries for each day. The sum(day[1:]) part sums all elements in the list starting from the second element (index 1), which are the case numbers for the G7 countries.

In part b, we're creating a list of lists where each inner list contains a country name and the total number of cases for that country in the first half of 2021. We're using a for loop to iterate over the indices of the countries in the cases list. For each country, we sum the number of cases for days in the first half of 2021. The condition '2021-01-01' <= day[0] <= '2021-06-30' checks if the date is within the first half of 2021. The full dates are needed as bounds because a bare '2021-06' sorts before every day in June and would exclude the whole month.

In part c, we're creating a list of the number of cases in Japan for each day. The number of cases in Japan is the sixth element (index 5) in each inner list in cases.
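
The date filter in part b relies on the fact that YYYY-MM-DD strings sort in chronological order when compared as text. Below is a small sketch of that behaviour, using made-up toy rows with only two count columns (invented numbers, not the real data). Note that a bare month prefix such as '2021-06' sorts before every day in June, so the upper bound needs to be a full date.

```python
# Toy rows with invented numbers, only to illustrate the date comparison.
toy_cases = [
    ['2020-12-31', 1, 10],
    ['2021-01-01', 2, 20],
    ['2021-06-30', 3, 30],
    ['2021-07-01', 4, 40],
]

# ISO dates compare chronologically as plain strings.
in_first_half = [row[0] for row in toy_cases
                 if '2021-01-01' <= row[0] <= '2021-06-30']
# in_first_half == ['2021-01-01', '2021-06-30']

# A month prefix is shorter, so it sorts before any day of that month.
prefix_excludes_june = '2021-06-30' <= '2021-06'    # False

# Per-column totals over the filtered rows.
totals = [sum(row[i] for row in toy_cases
              if '2021-01-01' <= row[0] <= '2021-06-30')
          for i in (1, 2)]
# totals == [5, 50]
```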
Please note that this answer comes from GPT-4.
If your problem has been solved, please mark a best answer.
