The goal of this session is to practice what we have seen in the first presentations:
We will write scripts that read a file (or a set of files) with a predefined format and compute simple quantities (sum, average, number) from the values in the files.
You will find a bunch of files in the directory TP/TP0_file_stats/data.
file0.1.txt
This can be done through the following steps:
python3 step0.0.py
path_to_data = '../TP/TP0_file_stats/data/'
file = path_to_data + "file0.1.txt"
# on Windows, replace '/' by '\':
# file = r"..\data\file0.1.txt"
# r like "raw" = no interpretation of the special characters ("\n", "\t", etc.)
# Such r-strings are also useful when we write Latex code in Python.
nb = 78; sum = 42.46; avg = 0.54
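As a side note on the path handling above, here is a minimal sketch of what a raw string changes and of a separator-free way to build the same path with pathlib from the standard library. Using pathlib is only one possible option, not something the exercise requires; the variable names simply mirror the ones above.

```python
from pathlib import Path

# In a raw string the backslash and the letter stay two separate characters;
# in a normal string "\n" is interpreted as one newline character.
print(len(r"\n"))  # 2
print(len("\n"))   # 1

# pathlib joins components with the "/" operator and uses the separator of
# the current operating system when the path is converted to a string.
path_to_data = Path("..") / "TP" / "TP0_file_stats" / "data"
file = path_to_data / "file0.1.txt"

print(file.name)         # file0.1.txt
print(file.parent.name)  # data
```

A Path object can be passed directly to open(), so the rest of a script does not need to change.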
def compute_stats(file_name):
    """ computes the statistics of data in file_name

    :param file_name: the name of the file to process
    :type file_name: str
    :return: the statistics
    :rtype: a tuple (number, sum, average)
    """
    pass
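One possible way to fill in this skeleton is sketched below. It assumes that the file contains numbers separated by whitespace (one or several per line) and that blank lines can be ignored; the actual layout of the data files is not described here, so adapt the parsing if it differs. The demonstration uses a small temporary file with made-up values, not the course data.

```python
import os
import tempfile


def compute_stats(file_name):
    """Compute the statistics of the data in file_name.

    Assumption: the file holds whitespace-separated numbers.

    :param file_name: the name of the file to process
    :type file_name: str
    :return: the statistics
    :rtype: a tuple (number, sum, average)
    """
    number = 0
    total = 0.0  # named "total" to avoid shadowing the built-in sum()
    with open(file_name) as data_file:
        for line in data_file:
            for field in line.split():  # a blank line yields no field
                total += float(field)
                number += 1
    average = total / number if number else 0.0
    return number, total, average


# Demonstration on a temporary file containing three made-up values.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_file = os.path.join(tmp_dir, "demo.txt")
    with open(demo_file, "w") as out:
        out.write("0.5\n1.5\n\n1.0\n")
    nb, total, avg = compute_stats(demo_file)

print(f"nb = {nb}; sum = {total:.2f}; avg = {avg:.2f}")
# nb = 3; sum = 3.00; avg = 1.00
```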
file0.1.txt, file0.2.txt and file0.3.txt
Same as step 0.1, but process several files and print both the per-file statistics and the overall statistics:
python3 step0.2.py
path_to_data = '../TP/TP0_file_stats/data/'
file = path_to_data + "file0.1.txt"
nb = 78 ; sum = 42.46 ; avg = 0.54
file = path_to_data + "file0.2.txt"
nb = 100 ; sum = 53.29 ; avg = 0.53
file = path_to_data + "file0.3.txt"
nb = 25 ; sum = 12.72 ; avg = 0.51
# total over all files:
nb = 203 ; sum = 108.47 ; avg = 0.53
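A point worth noting in the output above is that the overall average is the overall sum divided by the overall count, not the mean of the per-file averages. The sketch below illustrates this on made-up in-memory data standing in for the files; the helper names count_and_sum and format_stats are hypothetical, and the parsing again assumes whitespace-separated numbers.

```python
def count_and_sum(lines):
    """Return (number, sum) of the whitespace-separated floats in lines."""
    number, total = 0, 0.0
    for line in lines:
        values = [float(field) for field in line.split()]
        number += len(values)
        total += sum(values)
    return number, total


def format_stats(number, total):
    """Format the statistics like the sample output of the exercise."""
    average = total / number if number else 0.0
    return f"nb = {number} ; sum = {total:.2f} ; avg = {average:.2f}"


# Made-up stand-ins for the data files; with real files, iterate over
# the file names and pass the opened file object to count_and_sum().
fake_files = {
    "a.txt": ["1.0", "2.0", "3.0"],
    "b.txt": ["10.0"],
}

all_number, all_total = 0, 0.0
for name, lines in fake_files.items():
    number, total = count_and_sum(lines)
    print(f"file = {name}")
    print(format_stats(number, total))
    all_number += number
    all_total += total

print("# total over all files:")
print(format_stats(all_number, all_total))
# nb = 4 ; sum = 16.00 ; avg = 4.00
```

Here the per-file averages are 2.00 and 10.00, whose mean would be 6.00, while the correct overall average is 16.00 / 4 = 4.00.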
file_with_comment_col0.txt
Now suppose the files contain some comments (i.e. lines starting with a '#').
Adapt the previous script so that these lines are ignored (see the file file_with_comment_col0.txt).
Possible result:
python3 step1.0.py
file = ../TP/TP0_file_stats/data/file_with_comment_col0.txt
nb = 100; total = 53.29; avg = 0.53
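A minimal sketch of the comment filtering, run on made-up lines rather than on the course file. The function name is hypothetical, and as a design choice it also tolerates whitespace before the '#', which is slightly more permissive than a strict "first column" test.

```python
def count_and_sum_skip_comments(lines):
    """Return (number, sum) of the floats in lines, ignoring comment lines."""
    number, total = 0, 0.0
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and lines whose first visible character is '#'.
        if not stripped or stripped.startswith("#"):
            continue
        for field in stripped.split():
            total += float(field)
            number += 1
    return number, total


# Made-up sample data.
sample = ["# header comment\n", "0.25\n", "# another comment\n", "0.75\n", "\n"]
number, total = count_and_sum_skip_comments(sample)
print(f"nb = {number}; total = {total:.2f}; avg = {total / number:.2f}")
# nb = 2; total = 1.00; avg = 0.50
```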
file_with_comment_anywhere.txt
Now suppose the file contains comments in the middle of a line
(see e.g. file_with_comment_anywhere.txt), which mainly break the string-to-float conversion.
Adapt script 1.0 to handle this format.
Possible output:
python3 step1.1.py
path_to_data = '../TP/TP0_file_stats/data/'
file = path_to_data + "file_with_comment_col0.txt"
nb = 100 ; sum = 53.29 ; avg = 0.53
file = path_to_data + "file_with_comment_anywhere.txt"
nb = 96 ; sum = 51.65 ; avg = 0.54
# total over all files:
nb = 196 ; sum = 104.93 ; avg = 0.54
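One way to handle a comment that starts anywhere on a line is to cut the line at the first '#' before converting what remains. The sketch below assumes that '#' never appears inside a value; the function names and the sample lines are made up for the illustration.

```python
def strip_comment(line):
    """Return the part of line located before the first '#', stripped."""
    return line.split("#", 1)[0].strip()


def count_and_sum(lines):
    """Return (number, sum) of the floats found outside the comments."""
    number, total = 0, 0.0
    for line in lines:
        # A full-line comment leaves an empty string, which has no field.
        for field in strip_comment(line).split():
            total += float(field)
            number += 1
    return number, total


# Made-up sample data mixing the different cases.
sample = [
    "0.5 # first value\n",
    "# full-line comment\n",
    "1.5\n",
    "2.0# comment glued to the value\n",
]
number, total = count_and_sum(sample)
print(f"nb = {number} ; sum = {total:.2f} ; avg = {total / number:.2f}")
# nb = 3 ; sum = 4.00 ; avg = 1.33
```

Because a line starting with '#' becomes empty after the cut, the same function also covers the previous step.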
Now suppose the files are pre-formatted with lines of the form shown below (the key=value fields may appear in any order, as on the third line):
p1=0.7742 p2=0.74973 p3=0.77751
p1=0.7493 p2=0.34762 p3=0.44521
p1=0.4261 p3=0.88275 p2=0.74016
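Since the fields may come in any order, a dictionary keyed by the field name is a natural way to store one line. The sketch below parses the three sample lines shown above and accumulates a sum per key; parse_line is a hypothetical helper name, and it assumes every field is well formed (key=value with a numeric value).

```python
def parse_line(line):
    """Turn 'p1=0.1 p3=0.3 p2=0.2' into {'p1': 0.1, 'p3': 0.3, 'p2': 0.2}."""
    record = {}
    for field in line.split():
        key, _, value = field.partition("=")  # split at the first '='
        record[key] = float(value)  # raises ValueError on a malformed field
    return record


sample = [
    "p1=0.7742 p2=0.74973 p3=0.77751",
    "p1=0.7493 p2=0.34762 p3=0.44521",
    "p1=0.4261 p3=0.88275 p2=0.74016",
]

# Accumulate the sum of each key, whatever its position on the line.
totals = {}
for line in sample:
    for key, value in parse_line(line).items():
        totals[key] = totals.get(key, 0.0) + value

for key in sorted(totals):
    print(f"{key}: sum = {totals[key]:.4f}")
```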
Possible output:
python3 step2.0.py
checking ../data/file_mut_cols.txt
checking ../data/file_mut_cols_with_error.txt
line 8 contains only 1 fields, expecting 3
line 15 contains only 2 fields, expecting 3
line 20: keys do not match the required keys: problem with keys {'p2'}
line 23: keys do not match the required keys: problem with keys {'p3', 'p7'}
default_values = {'p1': 1, 'p2': 2, 'p3': 3}
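The two kinds of checks reported in the sample output of step2.0.py (wrong number of fields, unexpected keys) could be sketched as follows. This is an assumption about how such a check may work, not the reference solution: the names REQUIRED_KEYS and check_lines are hypothetical, the messages only imitate the sample output, and the "problem with keys" set is guessed to be the symmetric difference between the keys found and the keys required.

```python
REQUIRED_KEYS = {"p1", "p2", "p3"}  # assumed set of expected keys


def check_lines(lines, required_keys=REQUIRED_KEYS):
    """Return a list of messages describing the malformed lines."""
    problems = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) != len(required_keys):
            problems.append(
                f"line {line_number} contains {len(fields)} field(s), "
                f"expecting {len(required_keys)}"
            )
            continue
        keys = {field.partition("=")[0] for field in fields}
        # Keys that are missing or unexpected (symmetric difference).
        mismatch = keys ^ required_keys
        if mismatch:
            problems.append(
                f"line {line_number}: keys do not match the required keys: "
                f"problem with keys {sorted(mismatch)}"
            )
    return problems


# Made-up sample: two valid lines followed by three faulty ones.
sample = [
    "p1=0.7742 p2=0.74973 p3=0.77751",
    "p1=0.4261 p3=0.88275 p2=0.74016",
    "p1=0.5",
    "p1=0.1 p3=0.2 p3=0.3",
    "p1=0.1 p2=0.2 p7=0.3",
]
problems = check_lines(sample)
for message in problems:
    print(message)
```

The mismatching keys are sorted before printing so that the messages are identical from one run to the next, which a plain set display does not guarantee.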