This project provides a simple Python script that displays data on the world's most populous cities using the pandas library. It also includes an optional Matplotlib bar chart of the data, along with unit tests that check the DataFrame's structure and values.
The data includes information about the top ten most populous cities as of 2023, detailing the city name, country, and population in millions.
Ensure you have Python installed on your system. You will also need the following Python packages:
- pandas
- matplotlib (for optional visualization)
- unittest (for running tests; part of Python's standard library, so no installation is needed)
You can install pandas and matplotlib using pip:
pip install pandas matplotlib
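To confirm that the installation worked, a short check along the following lines may help. This is a minimal sketch: the helper name `missing_packages` is hypothetical and not part of this project; it relies only on `importlib.util.find_spec` from the standard library.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be located for import."""
    # find_spec returns None when a top-level module cannot be found
    return [name for name in names if importlib.util.find_spec(name) is None]


# Check the two third-party packages this project uses
missing = missing_packages(["pandas", "matplotlib"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("pandas and matplotlib are both available.")
```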
The main script is designed to create and display a DataFrame containing the city data. Here's the core functionality:
import pandas as pd
import matplotlib.pyplot as plt
# Data for the most populous cities
cities_data = {
    'City': ["Tokyo", "Delhi", "Shanghai", "Dhaka", "São Paulo", "Mexico City", "Cairo", "Beijing", "Mumbai", "Osaka"],
    'Country': ["Japan", "India", "China", "Bangladesh", "Brazil", "Mexico", "Egypt", "China", "India", "Japan"],
    'Population (Millions)': [37.2, 32.9, 24.8, 23.2, 22.6, 22.3, 22.2, 21.9, 20.7, 19.1]
}
# Create a DataFrame
cities_df = pd.DataFrame(cities_data)
# Display the DataFrame
print(cities_df)
# Optional: Visualize the data
plt.figure(figsize=(10, 6))
plt.bar(cities_df['City'], cities_df['Population (Millions)'], color='skyblue')
plt.title('Most Populous Cities in the World')
plt.xlabel('City')
plt.ylabel('Population (Millions)')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
- Save the script in a file named cities_population.py.
- Run the script using Python:
python cities_population.py
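This prints the DataFrame to the console and then opens a bar chart of the population data. Note that the script as written imports matplotlib at the top, so matplotlib must be installed for the script to run at all; the chart is optional only if that import and the plotting code are removed or guarded.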
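One way to make the chart genuinely optional is to guard the matplotlib import and skip plotting when it fails. The following is a minimal sketch, not the project's actual code; the function name `plot_cities` is hypothetical.

```python
# Guarded import: plt is None when matplotlib is not installed
try:
    import matplotlib.pyplot as plt
except ImportError:
    plt = None


def plot_cities(names, populations):
    """Draw the bar chart if matplotlib is available; return True if drawn."""
    if plt is None:
        print("matplotlib is not installed; skipping the chart.")
        return False
    plt.figure(figsize=(10, 6))
    plt.bar(names, populations, color='skyblue')
    plt.title('Most Populous Cities in the World')
    plt.xlabel('City')
    plt.ylabel('Population (Millions)')
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()
    return True
```

In the main script, the plotting lines could then be replaced by a single call such as `plot_cities(cities_df['City'], cities_df['Population (Millions)'])`.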
The project includes a set of unit tests to verify the correctness of the DataFrame. The tests check the structure, number of rows, and specific data values.
- Save the test code in a file named test_cities.py.
- Execute the tests using the following command:
python -m unittest test_cities.py
unittest prints OK when every test passes; otherwise it reports each failing test with a traceback showing which check on the DataFrame did not hold.
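The test code itself is not reproduced here, so the following is only a sketch of what test_cities.py might contain, covering the structure, row count, and specific values mentioned above. It assumes a hypothetical helper, `build_cities_df`, that rebuilds the DataFrame inside the test file; importing cities_population.py directly would also run its print and plotting code.

```python
import unittest

import pandas as pd


def build_cities_df():
    """Hypothetical helper: rebuild the DataFrame used by the main script."""
    return pd.DataFrame({
        'City': ["Tokyo", "Delhi", "Shanghai", "Dhaka", "São Paulo",
                 "Mexico City", "Cairo", "Beijing", "Mumbai", "Osaka"],
        'Country': ["Japan", "India", "China", "Bangladesh", "Brazil",
                    "Mexico", "Egypt", "China", "India", "Japan"],
        'Population (Millions)': [37.2, 32.9, 24.8, 23.2, 22.6,
                                  22.3, 22.2, 21.9, 20.7, 19.1],
    })


class TestCitiesData(unittest.TestCase):
    def setUp(self):
        self.df = build_cities_df()

    def test_columns(self):
        # Structure: exactly these three columns, in this order
        self.assertEqual(
            list(self.df.columns),
            ['City', 'Country', 'Population (Millions)'],
        )

    def test_row_count(self):
        # One row per city in the top ten
        self.assertEqual(len(self.df), 10)

    def test_first_row_values(self):
        # Specific values: the first row is Tokyo, Japan, 37.2 million
        self.assertEqual(self.df.loc[0, 'City'], "Tokyo")
        self.assertEqual(self.df.loc[0, 'Country'], "Japan")
        self.assertAlmostEqual(self.df.loc[0, 'Population (Millions)'], 37.2)

    def test_sorted_by_population(self):
        # The listed populations run from largest to smallest
        self.assertTrue(self.df['Population (Millions)'].is_monotonic_decreasing)
```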
This project provides an easy way to display and verify data about the most populous cities in the world using Python. The code is straightforward and can be extended for further data analysis or visualization tasks.
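As one illustration of such an extension, the sketch below totals the listed population per country using pandas `groupby`. It is an example under the same data as the main script, not part of the project; the variable name `by_country` is hypothetical.

```python
import pandas as pd

# Same data as in the main script
cities_df = pd.DataFrame({
    'City': ["Tokyo", "Delhi", "Shanghai", "Dhaka", "São Paulo",
             "Mexico City", "Cairo", "Beijing", "Mumbai", "Osaka"],
    'Country': ["Japan", "India", "China", "Bangladesh", "Brazil",
                "Mexico", "Egypt", "China", "India", "Japan"],
    'Population (Millions)': [37.2, 32.9, 24.8, 23.2, 22.6,
                              22.3, 22.2, 21.9, 20.7, 19.1],
})

# Sum the listed city populations for each country, largest total first
by_country = (
    cities_df.groupby('Country')['Population (Millions)']
    .sum()
    .sort_values(ascending=False)
)

# Round for display to hide floating-point noise in the sums
print(by_country.round(1))
```

The totals cover only the ten cities in the table, so they describe the listed cities rather than each country's overall population.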