# Python training UGA 2017

A training course to build a strong foundation in Python and use it efficiently

Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)

## Practical session 1

File parsing and dictionary usage

# Goal

The goal of this session is to end up with a script that computes some simple statistics from a Meteo open data file. The file was modified and reduced for this exercise (just one station, with data for a single year: 2016). In the future, you can download other data here: https://public.opendatasoft.com/explore/dataset/donnees-synop-essentielles-omm/export/

## Material

The file contains lines of the form:

ID OMM station,Date,Average wind 10 mn,Temperature,Humidity,Rainfall 3 last hours,Station

7761,2016-01-01T01:00:00+01:00,2.0,283.75,94,0.2,AJACCIO
7761,2016-01-01T04:00:00+01:00,2.2,283.95,91,0.2,AJACCIO
7761,2016-01-01T07:00:00+01:00,1.7,284.05,88,0.2,AJACCIO
7761,2016-01-01T10:00:00+01:00,1.6,287.05,75,0.2,AJACCIO
7761,2016-01-01T13:00:00+01:00,3.1,289.55,73,0.0,AJACCIO

This is a classic CSV file whose fields are separated by ",". The first line is the header.

## Information to extract

We want to compute some statistics for this station.

## Warning

The temperature measurements are in kelvin (273.15 K $\leftrightarrow$ 0 °C).
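As a small illustration of this warning, the conversion to degrees Celsius is a single subtraction. The function name `kelvin_to_celsius` is an assumption made for this sketch; it is not part of the session material.

```python
def kelvin_to_celsius(temperature_k):
    """Convert a temperature from kelvin to degrees Celsius."""
    return temperature_k - 273.15


# First temperature value of the sample lines shown above
first_temperature_c = kelvin_to_celsius(283.75)
print(f"{first_temperature_c:.2f} °C")
```

Keeping the raw values in kelvin and converting only when printing avoids mixing the two units in the stored data.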

Write a script with a function load_data() that

• opens the file
• loads the data into one of the following structures (more details below):
  • 1.1 Single dictionary
  • 1.2 Multiple structures
  • 1.3 Class instance (object-oriented)

## 1.1: Single dictionary (pres07)

Load the data into a single dictionary with this structure:

{'Date': [wind,temperature,humidity,rainfall]}


For example

{
'2016-01-01T01': [2.0,283.75,94,0.2],
'2016-01-01T04': [2.2,283.95,91,0.2]
}


In this case, we can consider YYYY-MM-DDTHH as the key for the station dictionary.

Split each line and extract data.

#### Hint

You can use the method split from the str class.

In [1]:
s = "I am lucky"
l = s.split()
print(l)

['I', 'am', 'lucky']
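The same method accepts a separator argument, which is what a comma-separated line needs. The sketch below splits one of the sample lines and builds the matching dictionary entry. The variable names are illustrative, and the code assumes that no field of the line is empty.

```python
# One data line copied from the sample of the Material section
line = "7761,2016-01-01T01:00:00+01:00,2.0,283.75,94,0.2,AJACCIO\n"

# strip() removes the trailing newline, split(",") cuts the line at each comma
fields = line.strip().split(",")

# Keep only the first 13 characters of the date field: 'YYYY-MM-DDTHH'
key = fields[1][:13]

# Convert the text fields to numbers (this raises ValueError on an empty field)
values = [float(fields[2]), float(fields[3]), int(fields[4]), float(fields[5])]

station = {key: values}
print(station)
```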


## 1.2: Multiple structures (pres07)

Load the data into multiple dictionaries or lists (one per field).

#### Example for dictionaries:

You can use the following structure

wind = {'Date1': wind_value1, 'Date2': wind_value2, ...}
temperature = {'Date1': temperature1, 'Date2': temperature2, ...}
...


#### Example for lists:

You can use the following structure

dates = ['Date1', 'Date2', ...]
wind = [wind_value1, wind_value2, ...]
temperature = [temperature1, temperature2, ...]
...
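A minimal sketch of the list-based variant is given here, working on a few lines held in memory rather than on the real file. The name `sample_lines` is an assumption made for illustration, and only three of the fields are stored to keep the example short.

```python
sample_lines = [
    "ID OMM station,Date,Average wind 10 mn,Temperature,Humidity,Rainfall 3 last hours,Station",
    "7761,2016-01-01T01:00:00+01:00,2.0,283.75,94,0.2,AJACCIO",
    "7761,2016-01-01T04:00:00+01:00,2.2,283.95,91,0.2,AJACCIO",
    "7761,2016-01-01T07:00:00+01:00,1.7,284.05,88,0.2,AJACCIO",
]

dates, wind, temperature = [], [], []

for line in sample_lines[1:]:  # skip the header line
    fields = line.strip().split(",")
    dates.append(fields[1][:13])
    wind.append(float(fields[2]))
    temperature.append(float(fields[3]))

# The i-th element of every list belongs to the same measurement
for date, speed, temp in zip(dates, wind, temperature):
    print(date, speed, temp)
```

With lists, the link between a date and its values is only the shared index, so all lists have to be filled in the same loop.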


## 1.3: Class instance (pres08)

Load the data into an instance of a class WeatherStation that you will define yourself. load_data() can therefore be a method of this class.

#### Hint

This is very similar to 1.2. The only difference is that the structures storing the data are attributes of a class.
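One possible shape for such a class is sketched below. To stay self-contained, this `load_data()` receives an iterable of lines instead of opening the file itself; that signature, the attribute names and `sample_lines` are assumptions of the sketch, not requirements of the exercise.

```python
class WeatherStation:
    """Hold the measurements of one station as parallel lists."""

    def __init__(self):
        self.dates = []
        self.wind = []
        self.temperature = []

    def load_data(self, lines):
        """Fill the attributes from an iterable of CSV lines (header first)."""
        for index, line in enumerate(lines):
            if index == 0:
                continue  # skip the header line
            fields = line.strip().split(",")
            self.dates.append(fields[1][:13])
            self.wind.append(float(fields[2]))
            self.temperature.append(float(fields[3]))


sample_lines = [
    "ID OMM station,Date,Average wind 10 mn,Temperature,Humidity,Rainfall 3 last hours,Station",
    "7761,2016-01-01T01:00:00+01:00,2.0,283.75,94,0.2,AJACCIO",
    "7761,2016-01-01T04:00:00+01:00,2.2,283.95,91,0.2,AJACCIO",
]

station = WeatherStation()
station.load_data(sample_lines)
print(len(station.dates), station.temperature)
```

An open file object is also an iterable of lines, so the same method would accept the object returned by `open()` inside a `with` block.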

# Step 2: Compute the maximum and average temperature for the station

Write two functions, get_max_temperature() and get_average_temperature(). Each one:

• returns a float
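As a rough sketch, both functions can be written with the built-ins `max`, `sum` and `len`. It assumes the single-dictionary structure of 1.1 (temperature at index 1 of each value list) and passes the dictionary as a `data` parameter, which is a choice made for this example only.

```python
station = {
    "2016-01-01T01": [2.0, 283.75, 94, 0.2],
    "2016-01-01T04": [2.2, 283.95, 91, 0.2],
    "2016-01-01T07": [1.7, 284.05, 88, 0.2],
}


def get_max_temperature(data):
    """Return the highest temperature (in kelvin) found in the dictionary."""
    return max(values[1] for values in data.values())


def get_average_temperature(data):
    """Return the mean temperature (in kelvin) over all measurements."""
    temperatures = [values[1] for values in data.values()]
    return sum(temperatures) / len(temperatures)


print(get_max_temperature(station))
print(get_average_temperature(station))
```

Both results are in kelvin; subtract 273.15 to display them in degrees Celsius.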

# Step 3: Compute the sum of the rainfall for one station

Write one function get_sum_rainfall() that sums the rainfall and

• returns a float

Be careful: some measurements have no rainfall data.
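The sketch below shows one way to cope with missing values. It assumes that a missing measurement shows up as an empty field once the line is split, which should be checked against the actual file; the `fields` parameter and the sample list are illustrative.

```python
# Raw rainfall fields as they might look after splitting each line;
# the empty string stands for a measurement without rainfall data (assumption)
rainfall_fields = ["0.2", "", "0.0", "1.4"]


def get_sum_rainfall(fields):
    """Sum the rainfall values, skipping fields that hold no data."""
    total = 0.0
    for field in fields:
        if field.strip():  # an empty field is falsy and is skipped
            total += float(field)
    return total


print(get_sum_rainfall(rainfall_fields))
```

Skipping the empty fields, rather than converting them blindly, avoids the `ValueError` that `float("")` would raise.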

# Step 4: Find the longest period without rainfall

Write one function period_without_rainfall() that

• returns the start date, the end date and the number of days without rainfall

## Hint

This is the syntax to return multiple values from a function:

return date_min, date_max, period_max / 8
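The division by 8 presumably converts a number of measurements into days: the sample lines are 3 hours apart, which gives 24 / 3 = 8 measurements per day. A possible sketch of the search follows, assuming the list structures of 1.2, passed as parameters for this example, and no missing rainfall values.

```python
dates = ["2016-01-01T01", "2016-01-01T04", "2016-01-01T07",
         "2016-01-01T10", "2016-01-01T13", "2016-01-01T16"]
rainfall = [0.2, 0.0, 0.0, 0.0, 0.4, 0.0]


def period_without_rainfall(dates, rainfall):
    """Return (start date, end date, length in days) of the longest dry run."""
    date_min = date_max = None
    period_max = 0      # length of the longest dry run, in measurements
    current_start = None
    current_length = 0  # length of the dry run being scanned
    for date, rain in zip(dates, rainfall):
        if rain == 0.0:
            if current_length == 0:
                current_start = date
            current_length += 1
            if current_length > period_max:
                period_max = current_length
                date_min, date_max = current_start, date
        else:
            current_length = 0  # rain ends the current dry run
    # 8 measurements per day, assuming one measurement every 3 hours
    return date_min, date_max, period_max / 8


start, end, days = period_without_rainfall(dates, rainfall)
print(start, end, days)
```

The caller unpacks the returned tuple into three names, as in the last lines of the sketch.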

# Step 5: How many hours with a humidity rate < 60

Write one function get_hours_humidity(rate) that

• takes one parameter: the humidity rate
• returns the number of hours
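One way to read this step is to count the measurements below the threshold and convert that count into hours. The sketch assumes that each measurement stands for a 3-hour interval, as in the sample lines, and it takes the humidity values as a second parameter, which the exercise signature does not have.

```python
HOURS_PER_MEASUREMENT = 3  # assumption: one measurement every 3 hours

# Humidity values of the five sample lines
humidity = [94, 91, 88, 75, 73]


def get_hours_humidity(rate, values):
    """Return the number of hours with a humidity strictly below `rate`."""
    count = sum(1 for value in values if value < rate)
    return count * HOURS_PER_MEASUREMENT


print(get_hours_humidity(90, humidity))
print(get_hours_humidity(60, humidity))
```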

## Final remark: Pandas

For this kind of data analysis, one should not rely on pure Python code without an external library! The Pandas library was written to do this in a few lines:

In [3]:
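import pandas

# Load the CSV file into a DataFrame. The path is a placeholder (assumption):
# replace it with the location of the data file for this session.
df = pandas.read_csv('data.csv')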

# print(df.columns)
# print(df.age)
# print(df['Temperature'])
# print(df[(df['Station']=='AJACCIO')])
# print(df[(df['Station'] == 'AJACCIO')]['Rainfall 3 last hours'].sum())

temp = df[(df['Station'] == 'AJACCIO')]['Temperature'].mean()-273.15
print(f'The average of temperature at Ajaccio is {temp:.1f} °C.')

The average of temperature at Ajaccio is 16.3 °C.

