Hands-On: Queries with SQL, pandas, Unix commands

Retrieve the three CSV files for this practical and study their content; they provide information, respectively, on country codes, country population, and Covid cases and deaths per country (a country being identified by its ISO 3166-1 alpha-3 code).
We will be interested in the following four queries on this dataset:
- Q1. What is the total number of Covid deaths?
- Q2. What are the ten countries with the highest number of deaths, with the corresponding number?
- Q3. What are the ten countries with the highest number of deaths relative to their population, with the corresponding number in ‰?
- Q4. What are the ten countries with the highest number of deaths relative to their population within Europe, with the corresponding number in ‰?
Use sqlite3 to load this dataset into a new database (using .open, .mode csv and then .import) and write SQL queries to answer these queries. Note the time needed for each query (using .timer on).
Add indexes (using CREATE INDEX) on relevant columns. Rerun the queries and check the timings.
Use pandas to load the dataset in main memory and run the same queries. Using Python's standard time module, note the time taken by each query.
Use Unix commands to process the dataset on disk and run the same queries. Use the time command as a prefix to your other commands to note the time taken by each query.