A training course to acquire a strong basis in Python and use it efficiently
Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)
4 built-in containers: list, tuple, set and dict...
For more containers: see collections...
Lists are mutable, ordered sequences of objects that may have different types. They can be viewed as arrays of references (nearly pointers) to objects.
# 2 equivalent ways to define an empty list
l0 = []
l1 = list()
assert l0 == l1
# non-empty lists
l2 = ["a", 2]
l3 = list(range(3))
print(l2, l3, l2 + l3)
print(3 * l2)
['a', 2] [0, 1, 2] ['a', 2, 0, 1, 2]
['a', 2, 'a', 2, 'a', 2]
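The "array of references" view can be illustrated with a minimal sketch. The names `a`, `b`, `inner`, `outer` and `shallow` are chosen here for illustration only: assigning a list to a second name does not copy it, and a shallow copy still shares the objects it refers to.

```python
# Two names bound to the same list object
a = [1, 2]
b = a            # no copy: b refers to the same list as a
b.append(3)
print(a)         # [1, 2, 3]
print(a is b)    # True

# A shallow copy is a new list holding the same references
inner = ["x"]
outer = [inner, 0]
shallow = outer.copy()
shallow[1] = 99          # rebinding a slot of the copy does not affect outer
shallow[0].append("y")   # mutating the shared inner list is visible in both
print(outer)             # [['x', 'y'], 0]
```

This is also why `3 * l2` repeats references to the same objects rather than duplicating them.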
The itertools module provides other ways of iterating over lists or sets of lists (e.g. Cartesian product, permutations, filter, ...).
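As a short sketch of what this can look like, the example below uses three functions of the standard `itertools` module (`product`, `permutations` and `chain`). The input lists are made up for illustration.

```python
from itertools import chain, permutations, product

letters = ["a", "b"]
numbers = [1, 2]

# Cartesian product of two lists
pairs = list(product(letters, numbers))
print(pairs)  # [('a', 1), ('a', 2), ('b', 1), ('b', 2)]

# All orderings of a list
orders = list(permutations([1, 2, 3]))
print(len(orders))  # 6

# Iterate over several lists as if they were a single one
flat = list(chain(letters, numbers))
print(flat)  # ['a', 'b', 1, 2]
```

These functions return lazy iterators, which is why the results are wrapped in `list` before printing.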
The built-in function dir returns a list of the names of an object's attributes. For a list, these are Python special attributes (with double underscores) and 11 public methods:
print(dir(l3))
['__add__', '__class__', '__class_getitem__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
l3.append(10)
print(l3)
l3.reverse()
print(l3)
[0, 1, 2, 10]
[10, 2, 1, 0]
# Built-in functions applied on lists
# return the smallest value
print(min(l3))
# return the largest value
print(max(l3))
# return a new sorted list
print(sorted([5, 2, 10, 0]))
0
10
[0, 2, 5, 10]
# "pasting" two lists can be done using zip
l1 = [1, 2, 3]
s = "abc"
print(list(zip(l1, s)))
print(list(zip("abc", "defg")))
[(1, 'a'), (2, 'b'), (3, 'c')]
[('a', 'd'), ('b', 'e'), ('c', 'f')]
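Two properties of zip may be worth a quick sketch: the same function can "unpaste" a list of pairs when combined with the `*` unpacking operator, and it stops silently at the shortest input. The variable names below are illustrative.

```python
pairs = [(1, "a"), (2, "b"), (3, "c")]

# "Unpasting": zip(*pairs) regroups the first items and the second items
numbers, letters = zip(*pairs)
print(numbers, letters)  # (1, 2, 3) ('a', 'b', 'c')

# zip stops at the shortest input, so the extra "g" is silently dropped
truncated = list(zip("abc", "defg"))
print(len(truncated))  # 3
```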
list: list comprehension
Lists are iterable, so they are often used in loops. We have already seen how to use the keyword for. For example, to build a new list (side note: x**2 computes x^2):
l0 = [1, 4, 10]
l1 = []
for number in l0:
    l1.append(number ** 2)
print(l1)
[1, 16, 100]
There is a more readable (and slightly more efficient) method to do such things, the "list comprehension":
l1 = [number ** 2 for number in l0]
print(l1)
[1, 16, 100]
# list comprehension with a condition
[s for s in ["a", "bbb", "e"] if len(s) == 1]
['a', 'e']
# list comprehensions can be cascaded
[(x, y) for x in [1, 2] for y in ["a", "b"]]
[(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]
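To make the order of a cascaded comprehension explicit, here is a sketch of the equivalent nested loops. The leftmost `for` of the comprehension corresponds to the outermost loop. The name `result` is chosen for illustration.

```python
# Explicit loops equivalent to the cascaded comprehension
result = []
for x in [1, 2]:          # leftmost "for" is the outer loop
    for y in ["a", "b"]:  # next "for" is the inner loop
        result.append((x, y))

print(result)  # [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]
print(result == [(x, y) for x in [1, 2] for y in ["a", "b"]])  # True
```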
tuple: immutable sequence
Tuples are very similar to lists, but they are immutable (they cannot be modified).
# 2 equivalent notations to define an empty tuple (not very useful...)
t0 = ()
t1 = tuple()
assert t0 == t1
# non-empty tuple
t2 = (1, 2, "a") # with the parentheses
t2 = 1, 2, "a" # it also works without parentheses
t3 = tuple(l3) # from a list
# tuples only have 2 public methods (with a list comprehension)
[name for name in dir(t3) if not name.startswith("__")]
['count', 'index']
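A small sketch of what immutability means in practice: item assignment on a tuple raises a `TypeError`, whereas building a new tuple is always possible. Because a tuple of hashable items is itself hashable, it can also serve as a dictionary key. The names `t`, `t_new` and `points` are illustrative.

```python
t = (1, 2, "a")

# Item assignment is not supported on tuples
try:
    t[0] = 10
except TypeError as error:
    print("TypeError:", error)

# Building a new tuple from pieces of the old one is allowed
t_new = (10,) + t[1:]
print(t_new)  # (10, 2, 'a')

# A tuple of hashable items can be used as a dict key (a list cannot)
points = {(0, 0): "origin"}
print(points[(0, 0)])  # origin
```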
# assignment of multiple variables in 1 line
a, b = 1, 2
print(a, b)
# exchange of values
b, a = a, b
print(a, b)
1 2
2 1
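Multiple assignment is a form of tuple unpacking, and it also accepts a starred target that collects the remaining items in a list. A minimal sketch with made-up values:

```python
# The starred name collects whatever is left, as a list
first, *rest = (1, 2, 3, 4)
print(first, rest)  # 1 [2, 3, 4]

# The starred name can also sit in the middle
head, *middle, tail = (1, 2, 3, 4)
print(head, middle, tail)  # 1 [2, 3] 4
```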
Tuples are used a lot with the keyword return in functions:
def myfunc():
    return 1, 2, 3
t = myfunc()
print(type(t), t)
# Directly unpacking the tuple
a, b, c = myfunc()
print(a, b, c)
<class 'tuple'> (1, 2, 3)
1 2 3
s0 = set()
{1, 1, 1, 3}
{1, 3}
set([1, 1, 1, 3])
{1, 3}
s1 = {1, 2}
s2 = {2, 3}
print(s1.intersection(s2))
print(s1.union(s2))
{2}
{1, 2, 3}
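The same set operations are also available as operators, which can be sketched as follows. The `sorted` call in the last lines is only there to print in a predictable order, since sets do not guarantee any ordering.

```python
s1 = {1, 2}
s2 = {2, 3}

print(s1 & s2)  # intersection: {2}
print(s1 | s2)  # union: {1, 2, 3}
print(s1 - s2)  # difference: {1}
print(s1 ^ s2)  # symmetric difference: {1, 3}

# Removing duplicates from a list
unique = set(["b", "a", "b"])
print(sorted(unique))  # ['a', 'b']
```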
set: lookup
Hashtable lookup (for example 1 in s1) is algorithmically efficient (average complexity O(1)), i.e. theoretically faster than a lookup in a list or a tuple (complexity O(n), where n is the size of the iterable).
print(1 in s1, 1 in s2)
True False
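The difference in complexity can be observed with a rough timing sketch based on the standard `timeit` module. The size `n` and the number of repetitions are arbitrary choices, and the measured times depend on the machine, so no expected values are given.

```python
from timeit import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# Time 100 membership tests on each container
t_list = timeit(lambda: target in as_list, number=100)
t_set = timeit(lambda: target in as_set, number=100)
print(f"list: {t_list:.5f} s, set: {t_set:.5f} s")
```

On typical hardware the set lookup should be faster by several orders of magnitude for this size, but for very small containers the difference can be negligible.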
from random import randint, shuffle
n = 20
i = randint(0, n - 1)
print("integer removed from the list:", i)
l = list(range(n))
l.remove(i)
shuffle(l)
print("shuffled list: ", l)
integer removed from the list: 3
shuffled list:  [7, 6, 15, 0, 14, 17, 10, 4, 12, 18, 9, 13, 1, 2, 5, 11, 8, 19, 16]
full_set = set(range(n))
changed_set = set(l)
ns = full_set - changed_set
ns.pop()
3
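For comparison, here is a sketch of another way to find the removed integer, which is not the set-based approach used above. It assumes the list contains every integer of `range(n)` exactly once except the removed one, so the missing value is the difference between the expected sum and the actual sum.

```python
from random import randint, shuffle

n = 20
i = randint(0, n - 1)
l = list(range(n))
l.remove(i)
shuffle(l)

# sum(range(n)) == n * (n - 1) // 2, so the difference is the missing integer
missing = n * (n - 1) // 2 - sum(l)
print(missing == i)  # True
```

The set difference is more general: it still works when several integers are missing.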
dict: set of key: value pairs
The dictionary (dict) is a very important data structure in Python. All namespaces are (nearly) dictionaries and "Namespaces are one honking great idea -- let's do more of those!" (The Zen of Python).
A dict is a hashtable (a set) + associated values.
d = {}
d["b"] = 2
d["a"] = 1
print(d)
{'b': 2, 'a': 1}
d = {"a": 1, "b": 2, 0: False, 1: True}
print(d)
{'a': 1, 'b': 2, 0: False, 1: True}
dict and list
You can first think of a dict as a super list which can be indexed with objects other than integers (and in particular with str).
l = ["value0", "value1"]
l.append("value2")
print(l)
['value0', 'value1', 'value2']
l[1]
'value1'
d = {"key0": "value0", "key1": "value1"}
d["key2"] = "value2"
print(d)
{'key0': 'value0', 'key1': 'value1', 'key2': 'value2'}
d["key1"]
'value1'
But be careful: a dict is not a sequence. Since Python 3.7 it preserves insertion order, but it is still based on a hashtable and cannot be indexed by position!
dict
dict: public methods
# dicts have 11 public methods (with a list comprehension)
[name for name in dir(d) if not name.startswith("__")]
['clear', 'copy', 'fromkeys', 'get', 'items', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']
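A few of these methods are sketched below on a small example dictionary. The keys and values are made up for illustration.

```python
d = {"key0": "value0", "key1": "value1"}

# get returns a default value instead of raising KeyError
print(d.get("key1"))            # value1
print(d.get("missing", "n/a"))  # n/a

# setdefault inserts the key only if it is absent
d.setdefault("key2", "value2")
d.setdefault("key0", "ignored")  # key0 exists, so nothing changes

# update merges another dict; pop removes a key and returns its value
d.update({"key3": "value3"})
removed = d.pop("key1")
print(removed)  # value1
print(d)        # {'key0': 'value0', 'key2': 'value2', 'key3': 'value3'}
```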
dict: different ways to loop over a dictionary
# loop with items
for key, value in d.items():
    if isinstance(key, str):
        print(key, value)
key0 value0
key1 value1
key2 value2
# loop with values
for value in d.values():
    print(value)
value0
value1
value2
# loop with keys
for key in d.keys():
    print(key)
key0
key1
key2
# dict comprehension (here for the "inversion" of the dictionary)
print(d)
d1 = {v: k for k, v in d.items()}
{'key0': 'value0', 'key1': 'value1', 'key2': 'value2'}
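The inverted dictionary itself is not printed above. The sketch below shows what it contains, together with a caveat: the inversion is only lossless when the values are unique, because later keys overwrite earlier ones. The dictionary `dup` is a made-up example.

```python
d = {"key0": "value0", "key1": "value1", "key2": "value2"}
d1 = {v: k for k, v in d.items()}
print(d1)  # {'value0': 'key0', 'value1': 'key1', 'value2': 'key2'}

# With duplicate values, only the last key survives the inversion
dup = {"a": 1, "b": 1}
inverted = {v: k for k, v in dup.items()}
print(inverted)  # {1: 'b'}
```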
Write a function that returns a dictionary containing the number of occurrences of letters in a text.
text = "abbbcc"
def count_elem(sequence):
    d = {}
    for letter in sequence:
        if letter not in d:
            d[letter] = 1
        else:
            d[letter] += 1
    return d
print("text=", text, "counts=", count_elem(text))
text= abbbcc counts= {'a': 1, 'b': 3, 'c': 2}
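Two shorter variants of the same counting exercise are sketched here for comparison. The first uses `Counter` from the standard `collections` module, and the second, under the hypothetical name `count_elem_get`, replaces the `if`/`else` with `dict.get`.

```python
from collections import Counter

text = "abbbcc"

# Counter is a dict subclass specialized for counting hashable items
counts = Counter(text)
print(dict(counts))           # {'a': 1, 'b': 3, 'c': 2}
print(counts.most_common(1))  # [('b', 3)]


def count_elem_get(sequence):
    """Count occurrences using dict.get with a default of 0."""
    d = {}
    for letter in sequence:
        d[letter] = d.get(letter, 0) + 1
    return d


print(count_elem_get(text))   # {'a': 1, 'b': 3, 'c': 2}
```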