The goal of this session is to write a script that computes some simple statistics from Meteo open data files. The file was modified and reduced for this exercise (a single station, with data for a single year: 2016). Later on, you can download other data here: https://public.opendatasoft.com/explore/dataset/donnees-synop-essentielles-omm/export/
The file contains lines of the form:
ID OMM station,Date,Average wind 10 mn,Temperature,Humidity,Rainfall 3 last hours,Station
7761,2016-01-01T01:00:00+01:00,2.0,283.75,94,0.2,AJACCIO
7761,2016-01-01T04:00:00+01:00,2.2,283.95,91,0.2,AJACCIO
7761,2016-01-01T07:00:00+01:00,1.7,284.05,88,0.2,AJACCIO
7761,2016-01-01T10:00:00+01:00,1.6,287.05,75,0.2,AJACCIO
7761,2016-01-01T13:00:00+01:00,3.1,289.55,73,0.0,AJACCIO
This is a classic CSV file in which the fields are separated by ",". The first line is the header.
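As an aside, the header line can be used to access fields by name. The sketch below uses the csv module of the standard library on a small in-memory sample (the variable names sample, reader and rows are only illustrative); the exercise itself can equally be solved with str.split.

```python
import csv
import io

# In-memory sample with the same header as the exercise file
sample = io.StringIO(
    "ID OMM station,Date,Average wind 10 mn,Temperature,Humidity,"
    "Rainfall 3 last hours,Station\n"
    "7761,2016-01-01T01:00:00+01:00,2.0,283.75,94,0.2,AJACCIO\n"
)

# DictReader consumes the first line as the header and maps
# each following line to a {column name: text value} dictionary
reader = csv.DictReader(sample)
rows = list(reader)

print(reader.fieldnames)
print(rows[0]["Temperature"])  # values are still strings at this point
```

Every value read this way is text, so numeric fields still have to be converted with float() or int() before computing statistics.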
We want to compute some statistics for this station.
Temperatures are given in kelvin (273.15 K $\leftrightarrow$ 0 °C).
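The conversion can be wrapped in a small helper. This is a minimal sketch; the names KELVIN_OFFSET and kelvin_to_celsius are illustrative and not imposed by the exercise.

```python
KELVIN_OFFSET = 273.15  # 0 °C expressed in kelvin


def kelvin_to_celsius(kelvin):
    """Convert a temperature from kelvin to degrees Celsius."""
    return kelvin - KELVIN_OFFSET


# First temperature of the sample lines shown above
print(f"{kelvin_to_celsius(283.75):.2f} °C")
```

Formatting the result with a fixed number of decimals hides the small rounding noise that floating-point subtraction can introduce.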
Write a script with a function load_data()
that loads the data into a single dictionary with this structure:
{'Date': [wind,temperature,humidity,rainfall]}
For example
{
'2016-01-01T01': [2.0,283.75,94,0.2],
'2016-01-01T04': [2.2,283.95,91,0.2]
}
In this case, the first 13 characters of the date (YYYY-MM-DDTHH) can serve as the key of the station dictionary.
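One possible way to build such a key is to slice the timestamp. The helper name make_key is hypothetical, and the sketch assumes every date in the file follows the format of the sample lines.

```python
def make_key(timestamp):
    """Keep only the 'YYYY-MM-DDTHH' part of a timestamp string."""
    # 'YYYY-MM-DD' is 10 characters, 'T' is 1 and 'HH' is 2: 13 in total
    return timestamp[:13]


print(make_key("2016-01-01T01:00:00+01:00"))
```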
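Split each line and extract the data.
You can use the split method of the str class. Without an argument it splits on whitespace; pass a separator such as "," to split the fields of a CSV line.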
s = "I am lucky"
l = s.split()
print(l)
['I', 'am', 'lucky']
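Putting these hints together, load_data could look like the sketch below. Several points are assumptions rather than requirements of the exercise: the function takes an iterable of lines (an open file works) instead of no argument, the helper to_number is hypothetical, a missing value is assumed to appear as an empty field and is stored as None, and humidity is assumed to be written as a whole number as in the sample.

```python
SAMPLE_LINES = [
    "ID OMM station,Date,Average wind 10 mn,Temperature,Humidity,"
    "Rainfall 3 last hours,Station",
    "7761,2016-01-01T01:00:00+01:00,2.0,283.75,94,0.2,AJACCIO",
    "7761,2016-01-01T04:00:00+01:00,2.2,283.95,91,0.2,AJACCIO",
]


def to_number(field, convert=float):
    """Convert a text field to a number; an empty field becomes None."""
    return convert(field) if field else None


def load_data(lines):
    """Build {'YYYY-MM-DDTHH': [wind, temperature, humidity, rainfall]}."""
    data = {}
    rows = iter(lines)
    next(rows, None)  # skip the header line
    for line in rows:
        fields = line.strip().split(",")
        if len(fields) < 6:
            continue  # ignore blank or incomplete lines
        key = fields[1][:13]  # 'YYYY-MM-DDTHH'
        data[key] = [
            to_number(fields[2]),       # wind
            to_number(fields[3]),       # temperature in kelvin
            to_number(fields[4], int),  # humidity
            to_number(fields[5]),       # rainfall
        ]
    return data


print(load_data(SAMPLE_LINES))
```

With a real file, the same function could be called as load_data(f) inside a "with open(...) as f:" block, since a file object yields its lines one by one.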
Load the data in multiple dictionaries or lists (one per field).
With dictionaries, you can use the following structure:
wind = {'Date1': wind_value1, 'Date2': wind_value2, ...}
temperature = {'Date1': temperature1, 'Date2': temperature2, ...}
...
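As an illustration of the one-dictionary-per-field layout, the sketch below derives the per-field dictionaries from a small dictionary in the single-dictionary format. The variable data and its two entries are sample values copied from the example lines, not the full file.

```python
# Sample in the single-dictionary format: [wind, temperature, humidity, rainfall]
data = {
    "2016-01-01T01": [2.0, 283.75, 94, 0.2],
    "2016-01-01T04": [2.2, 283.95, 91, 0.2],
}

# One dictionary per field, all sharing the same date keys
wind = {date: values[0] for date, values in data.items()}
temperature = {date: values[1] for date, values in data.items()}
humidity = {date: values[2] for date, values in data.items()}
rainfall = {date: values[3] for date, values in data.items()}

print(temperature)
```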
With lists, you can use the following structure:
dates = ['Date1', 'Date2', ...]
wind = [wind_value1, wind_value2, ...]
temperature = [temperature1, temperature2, ...]
...
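With lists, the values that belong to the same measurement share the same index. A minimal sketch, assuming the rows have already been split and converted (the variable rows is a hand-written sample):

```python
# Each tuple: (date key, wind, temperature, humidity, rainfall)
rows = [
    ("2016-01-01T01", 2.0, 283.75, 94, 0.2),
    ("2016-01-01T04", 2.2, 283.95, 91, 0.2),
    ("2016-01-01T07", 1.7, 284.05, 88, 0.2),
]

dates, wind, temperature, humidity, rainfall = [], [], [], [], []
for date, w, t, h, r in rows:
    # Append to every list in the same iteration so that indexes stay aligned
    dates.append(date)
    wind.append(w)
    temperature.append(t)
    humidity.append(h)
    rainfall.append(r)

# Index 1 refers to the same measurement in every list
print(dates[1], temperature[1])
```

The lists must always be appended to together; if one of them is filtered or sorted on its own, the indexes no longer match.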
Write 2 functions, get_max_temperature()
and get_average_temperature(),
that return the maximum and the average temperature, respectively.
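A possible shape for these two functions is sketched below. It assumes the temperatures are passed in as a list (the exercise leaves the signature open), that missing values are stored as None, and that results are returned in the same unit as the input, here kelvin.

```python
def get_max_temperature(temperatures):
    """Return the highest temperature, ignoring missing (None) values."""
    valid = [t for t in temperatures if t is not None]
    return max(valid)


def get_average_temperature(temperatures):
    """Return the mean temperature, ignoring missing (None) values."""
    valid = [t for t in temperatures if t is not None]
    return sum(valid) / len(valid)


# Temperatures (in kelvin) of the five sample lines
temperatures = [283.75, 283.95, 284.05, 287.05, 289.55]

print(f"max: {get_max_temperature(temperatures):.2f} K")
print(f"average: {get_average_temperature(temperatures):.2f} K")
```

Both functions raise an error on an empty list (ValueError for max, ZeroDivisionError for the average), so a real script may want to check for that case first.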
Write 1 function get_sum_rainfall()
that sums the rainfall.
Be careful: some measurements have no rainfall data.
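One way to handle the missing values is to skip them while summing. This sketch assumes the rainfall values are passed in as a list and that a missing measurement was stored as None when the file was loaded; both are choices made for the illustration.

```python
def get_sum_rainfall(rainfall):
    """Sum the rainfall values, skipping missing (None) measurements."""
    return sum(r for r in rainfall if r is not None)


# Sample values with one missing measurement
sample_rainfall = [0.2, 0.2, None, 0.2, 0.0]
print(f"{get_sum_rainfall(sample_rainfall):.1f}")
```

Testing with "is not None" rather than "if r" matters here: a measured rainfall of 0.0 is valid data, even though it does not change the sum.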
Write 1 function get_hours_humidity(rate)
For this kind of data analysis, one should not rely on pure Python code without an external library! The pandas library was written to do this in a few lines:
import pandas
df = pandas.read_csv(
'../TP/TP1_MeteoData/data/synop-2016.csv', sep=',', header=0)
# print(df.columns)
# print(df.Humidity)
# print(df['Temperature'])
# print(df[(df['Station']=='AJACCIO')])
# print(df[(df['Station'] == 'AJACCIO')]['Rainfall 3 last hours'].sum())
temp = df[(df['Station'] == 'AJACCIO')]['Temperature'].mean()-273.15
print(f'The average temperature at Ajaccio is {temp:.1f} °C.')
The average temperature at Ajaccio is 16.3 °C.