# Q3. Coronavirus cases
---
## Learning objectives
* Perform simple analysis on real data by programming with what we have learned in week 3, including:
  * Extracting information from a string using indexing and slicing
  * Working with a `list`, including modifying it
* Realise that, while we can use built-in Python containers like `list` to analyse data, doing so is not as convenient as one would hope
---
## Background information
In this question, we analyse coronavirus data from [WHO](covid19.who.int). The data has been cleaned and saved in [data/coronavirus_cases.csv](data/coronavirus_cases.csv). The CSV file contains the daily number of coronavirus cases for the G7 countries (Canada, France, Germany, Italy, Japan, the United Kingdom and the United States) from 3 January 2020 to 30 September 2021.
---
## Instructions
1. Have a look at the code provided in the [`src/coronavirus_cases.py`](src/coronavirus_cases.py) file, which does the following:
* Loads the data from [data/coronavirus_cases.csv](data/coronavirus_cases.csv) using `open()`
* For each line of the CSV file:
  * Splits the line into a list at the commas (via `.split(',')`), with the first value representing a day and the next 7 values representing the number of cases on that day for the G7 countries
  * Casts the number of cases from `str` to `int` using `int()`
  * Adds the resulting list to `cases` using `append()`

**Note**
* You are not required to fully understand the given code, but you should have some idea of what it does
* You do NOT need to (and should not) write code to load the data yourself. It has already been done for you
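
To give a feel for the steps listed above, here is a minimal sketch of the split, cast and append logic. It is not the provided code: `sample_lines` is a hypothetical stand-in for the lines read from the CSV file with `open()`, and the sketch assumes those lines carry no header row.

```python
# Hypothetical stand-in for lines read from data/coronavirus_cases.csv.
# Assumption: no header row; the provided loading code may differ.
sample_lines = [
    "2021-09-29,4279,5859,11780,2962,1570,34520,94879\n",
    "2021-09-30,3491,5479,12150,3212,2005,35059,107399\n",
]

cases = []
for line in sample_lines:
    # Remove the trailing newline, then split at the commas.
    values = line.strip().split(',')

    # Keep the day as a str and cast the 7 counts from str to int.
    row = [values[0]]
    for value in values[1:]:
        row.append(int(value))

    # Add the finished row to the table.
    cases.append(row)

print(cases)
```

Each pass through the loop turns one line of text into one row of the table, which is why `cases` ends up as a `list` of `list`.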
2. Run the whole [`src/coronavirus_cases.py`](src/coronavirus_cases.py) file in `Spyder` and understand the object that the variable `cases` binds to:
* `cases` is a `list` of `list`. It can be viewed as a table with `637` rows and `8` columns, and looks like the following:
```
[['2020-01-03',0,0,0,0,0,0,0],
...
['2021-01-01',7476,13186,22924,23477,4091,70797,237337],
['2021-01-02',8420,2711,12690,22210,3617,52783,230749],
...
['2021-09-29',4279,5859,11780,2962,1570,34520,94879],
['2021-09-30',3491,5479,12150,3212,2005,35059,107399]]
```
* For each "row" (or a `list` inside `cases`):
  * The first element is a `str` storing the day information
  * The next 7 elements are `int` storing the number of new cases for each country (Canada, France, Germany, Italy, Japan, the United Kingdom, the United States) on the corresponding day
If you are not sure how to run the whole file to create `cases`, please watch the video on Moodle under "Coursework" -> "Hints and visualisation", which demonstrates how to do it.
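
As a quick illustration of this structure, the sketch below indexes and slices a two-row sample copied from the excerpt of `cases` shown earlier. The variable names other than `cases` are chosen for illustration only.

```python
# Two rows copied from the excerpt of `cases`; the real table has 637 rows.
cases = [
    ['2021-01-01', 7476, 13186, 22924, 23477, 4091, 70797, 237337],
    ['2021-01-02', 8420, 2711, 12690, 22210, 3617, 52783, 230749],
]

row = cases[0]      # first row of this small sample
day = row[0]        # the day, stored as a str
counts = row[1:]    # the 7 int counts, in country order

year = day[:4]      # characters 0 to 3 of 'YYYY-MM-DD'
month = day[5:7]    # characters 5 and 6 of 'YYYY-MM-DD'

# Canada is at index 1, so Japan (the 5th country) is at index 5.
japan = row[5]

print(day, year, month, japan)
```

Note that the country columns start at index 1 because index 0 holds the day.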
3. Create the required variables below by writing your code in [`src/coronavirus_cases.py`](src/coronavirus_cases.py):
* a. Calculate the total number of cases across all 7 countries for each day and bind it to the variable `g7_cases`, which is a `list` of `int` with length `637`. `g7_cases` should look something like:
```
[0, 0, ..., 155849, 168795]
```
* b. Calculate the total number of cases for each country in the first 6 months of 2021 and bind it to the variable `cases_2021_first_half`, which is a `list` of `list`. For each row, the first element is a country name (`str`) and the second element is the total number of cases in the first half of 2021 (`int`) for that country. Please use the country names 'Canada', 'France', 'Germany', 'Italy', 'Japan', 'United Kingdom', 'United States'. `cases_2021_first_half` should look like:
```
[['Canada', xxx],
['France', yyy],
...
['United States', zzz]]
```
where `xxx`, `yyy`, `zzz` are the actual numbers of cases.
* c. Using the variable `cases` from step 2, extract the column for Japan. Store it in the variable `japan_cases`, which is a `list` of `int` with length `637`. `japan_cases` should look like this:
```
[0, ..., ..., 1570, 2005]
```
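
The three computations can be approached with plain loops, indexing and slicing. The sketch below is one possible approach on a four-row sample copied from the excerpt of `cases`, not a reference solution. The helper names `countries` and `japan_index` are assumptions introduced for illustration, and the totals it produces apply to the sample only.

```python
# Four rows copied from the excerpt of `cases`. In the coursework, `cases`
# is created by the provided loading code and has 637 rows.
cases = [
    ['2021-01-01', 7476, 13186, 22924, 23477, 4091, 70797, 237337],
    ['2021-01-02', 8420, 2711, 12690, 22210, 3617, 52783, 230749],
    ['2021-09-29', 4279, 5859, 11780, 2962, 1570, 34520, 94879],
    ['2021-09-30', 3491, 5479, 12150, 3212, 2005, 35059, 107399],
]

# Country names in column order (illustrative helper, not in the given file).
countries = ['Canada', 'France', 'Germany', 'Italy', 'Japan',
             'United Kingdom', 'United States']

# (a) Daily total across the 7 countries: sum everything after the day.
g7_cases = []
for row in cases:
    g7_cases.append(sum(row[1:]))

# (b) Per-country total for January to June 2021.
cases_2021_first_half = []
for name in countries:
    cases_2021_first_half.append([name, 0])

for row in cases:
    day = row[0]
    # Slice the year and month out of the 'YYYY-MM-DD' string.
    if day[:4] == '2021' and int(day[5:7]) <= 6:
        for i in range(len(countries)):
            # Country i sits in column i + 1 because column 0 is the day.
            cases_2021_first_half[i][1] += row[i + 1]

# (c) Japan column: look up its position rather than hard-coding it.
japan_index = countries.index('Japan') + 1
japan_cases = []
for row in cases:
    japan_cases.append(row[japan_index])

print(g7_cases)
print(cases_2021_first_half)
print(japan_cases)
```

On the sample, the last two entries of `g7_cases` and `japan_cases` agree with the expected outputs shown in parts (a) and (c), since those rows are the final two days of the data.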
---
## Requirements
* Please make good use of comments to make it clear which block of code belongs to which part of the question. Alternatively, you can use functions to structure your code
* Make sure you obtain the results with Python code, not by eyeballing the data or by using other tools such as Excel
---
## Note
* The autograder checks your code by inspecting the values of the variables `g7_cases`, `cases_2021_first_half` and `japan_cases`
* In weeks 7 and 8, we will introduce `NumPy` and `Pandas`, which make the above computations much simpler