Fast prototyping (Numpy!)
Popular:
Well-known
Several great libraries
Share ideas between developers / scientists
Popularity counts
Readability counts
Expressivity counts
In any case, one needs a good and well-known scripting language, so yes!
(even considering Julia)
Designed for fast prototyping & gluing codes together
Generalist + easy to learn ⇒ huge and diverse community 👨🏿🎓🕵🏼 👩🏼🎓 👩🏽🏫👨🏽💻👩🏾🔬 🎅🏼 🌎 🌍 🌏
Expressivity and readability
Not oriented towards high performance
(fast and easy dev, easy debug, correctness)
Highly dynamic 🐒 + introspection (inspect.stack())
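A minimal sketch of this introspection (the function names are illustrative): a function can look at its own call stack at runtime.

```python
import inspect

def who_called_me():
    # introspection: look one frame up the call stack, at the caller's frame record
    return inspect.stack()[1].function

def caller():
    return who_called_me()

print(caller())  # -> "caller"
```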
Automatic memory management 💾
All objects encapsulated 🥥 (PyObject, C struct)
Objects accessible through "references" ➡️
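A quick illustration of these references: assignment never copies an object, it only binds another name to the same object.

```python
a = [1, 2, 3]
b = a              # b is a second reference to the same list object
b.append(4)
print(a)           # [1, 2, 3, 4]: both names see the mutation
print(a is b)      # True: one object, two references
```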
Usually interpreted
Interpreted (nearly) instruction by instruction, with (nearly) no code optimization
The numerical stack (Numpy, Scipy, Scikits, ...) based on the CPython C API (CPython implementation details)!
Optimized implementation with tracing Just-In-Time compilation
The CPython C API is an issue! PyPy can't accelerate Numpy code!
For microcontrollers
mylist = [1, 3, 5]
list: an array of references to PyObjects
arr = 2 * np.arange(10)
print(arr[2])
4
Pure Python is terrible 🐢 (except with PyPy)...
from math import sqrt
my_const = 10.
result = [elem * sqrt(my_const * 2 * elem**2) for elem in range(1000)]
but even this is not very efficient (temporary objects)...
import numpy as np
a = np.arange(1000)
result = a * np.sqrt(my_const * 2 * a**2)
Even slightly worse with PyPy 🙁
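A hedged sketch of how one could compare the two versions with timeit (absolute timings will vary by machine):

```python
from math import sqrt
from timeit import timeit

import numpy as np

my_const = 10.

def pure_python():
    return [elem * sqrt(my_const * 2 * elem**2) for elem in range(1000)]

def with_numpy():
    a = np.arange(1000)
    return a * np.sqrt(my_const * 2 * a**2)

# both compute the same values; only the cost per element differs
print("pure Python:", timeit(pure_python, number=1000))
print("Numpy      :", timeit(with_numpy, number=1000))
```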
cProfile (pstats, SnakeViz), line-profiler, perf, perf_events
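A minimal cProfile session (the profiled function is just a placeholder workload):

```python
import cProfile
import io
import pstats

def work():
    # placeholder workload: interpreter-heavy pure Python
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())  # shows where the time is spent, function by function
```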
"Premature optimization is the root of all evil" (Donald Knuth)
80/20 rule: efficiency is important for the expensive parts and NOT for the small ones
For example, using Numpy arrays instead of Python lists...
unittest
, pytest
pipelining, hyper-threading, vectorization, advanced instructions (simd), ...
important to get data aligned in memory (arrays)
What does CPython do? It compiles to "byte code", with nearly no optimization (see the dis module)
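One can look at this bytecode directly with the dis module (exact opcode names vary between CPython versions):

```python
import dis

def add(a, b):
    return a + b

# prints the bytecode: load the two locals, one add opcode, return
dis.dis(add)
```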
Just-in-time
Has to be fast (warm up), can be hardware specific
Ahead-of-time
Can be slow, hardware specific or more general to distribute binaries
Compilers are usually good for optimizations! Better than most humans...
From one language to another language (for example Python to C++)
handled by the OS
share memory and can use different CPU cores at the same time
How?
OpenMP (Natively in C / C++ / Fortran. For Python: Pythran, Cython, ...)
In Python: threading
and concurrent.futures
⚠️ in Python, one interpreter per process (~) and the Global Interpreter Lock (GIL)...
In a Python program, different threads can run at the same time (and take advantage of multicore)
But... the Python interpreter runs the Python bytecode sequentially!
Terrible 🐌 for CPU-bound tasks if the Python interpreter is used a lot!
No problem for IO-bound tasks!
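A sketch of why IO-bound code is fine: time.sleep (like real IO waits) releases the GIL, so the threads overlap instead of running one after the other.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task):
    # sleeping releases the GIL, exactly as real IO waits do
    time.sleep(0.1)
    return task

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_io, range(4)))
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f} s")  # ~0.1 s in total, not 4 x 0.1 s
```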
Many tools to interact with static languages:
ctypes, cffi, cython, cppyy, pybind11, f2py, pyo3, ...
Glue together pieces of native code (C, Fortran, C++, Rust, ...) with a nice syntax
⇒ Numpy, Scipy, ...
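As a tiny ctypes sketch (assuming a Unix-like system where the C math library can be located), calling a native C function from Python:

```python
import ctypes
import ctypes.util

# load the C math library and declare sqrt's C signature
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(2.0))  # calls the native C function directly
```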
Remarks:
Numpy: great syntax for expressing algorithms, (nearly) as much information as in Fortran
Performance of a @ b
(Numpy) versus a * b
(Julia)?
Same! The same library is called! (often OpenBlas or MKL)
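A minimal illustration of `a @ b` in Numpy; the operator dispatches the matrix product to the underlying BLAS library.

```python
import numpy as np

a = np.arange(6.).reshape(2, 3)
b = np.arange(6.).reshape(3, 2)

# `@` calls the BLAS matrix product (often OpenBLAS or MKL under the hood)
c = a @ b
print(c)  # [[10. 13.] [28. 40.]]
```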
Don't use the Python interpreter (and small Python objects) too much for computationally demanding tasks.
Pure Python
→ Numpy
→ Numpy without too many loops (vectorized)
→ C extensions
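The "Numpy without too many loops" step above can be sketched as follows: an explicit loop goes back through the interpreter for every element, while the vectorized form runs the loop in compiled code inside Numpy.

```python
import numpy as np

a = np.arange(10_000)

# explicit loop: one interpreter round-trip (and one small object) per element
slow = 0
for x in a:
    slow += int(x) ** 2

# vectorized: the whole loop runs in compiled C inside Numpy
fast = int(np.sum(a * a))
print(slow == fast)  # True: same result, very different speed
```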
But ⚠️ ⚠️ ⚠️ writing a C extension by hand is not a good idea! ⚠️ ⚠️ ⚠️
compile Python
write C extensions without writing C
Cython, Numba, Pythran, Transonic, PyTorch, ...
Performance issues, especially for crunching numbers 🔢
⇒ need to accelerate the "numerical kernels"
Many good accelerators and compilers for Python-Numpy code
⇒ We shouldn't have to write specialized code for one accelerator!
Other languages don't replace Python for sciences
Modern C++ is great and very complementary 💑 with Python
Julia is interesting but not the heaven on earth