Practical numpy: image processing

The goal of this practical session is to extract information from a set of images. Suppose we are in front of a beautiful place (e.g. the Taj Mahal in India).

As the place is very nice, there are plenty of people moving around and we cannot take a photo without having somebody in front of the building.

The idea consists in taking many images from exactly the same place. A given pixel will then take different values across the images: either the value corresponding to the building (most of the time) or the value corresponding to a person standing between the camera and the building.

Assuming that the pixel shows the building in more than half of the images, we can recover the building value by computing the median of the different values taken by this pixel.
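To see why the median is a reasonable choice, here is a small sketch on a single pixel. The seven grey values are made up for illustration: five shots show the building, two show a passer-by.

```python
import numpy as np

# Hypothetical grey values of one pixel across seven shots:
# the building is around 200, a passer-by darkens two of the shots.
samples = np.array([201, 199, 35, 200, 202, 40, 198])

print(np.median(samples))  # 199.0, close to the building value
print(np.mean(samples))    # about 153.6, pulled down by the two outliers
```

The mean mixes the building and the passer-by, while the median ignores the outliers as long as they stay in the minority.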

The goal of this practical session is to write a script that does this job.

We will assume here that all the images can fit in memory.

Get the list of images

Write a function that takes as input a directory and a suffix and returns the list of files in the directory whose names end with the given suffix.


  • use os.listdir to list the set of files in a directory
  • use str.endswith to filter this set
  • use a list comprehension to do it in one pass
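One possible sketch following the three hints above. The function name list_images is an assumption, and so is the choice to sort the names and return full paths (os.listdir returns bare names in arbitrary order).

```python
import os


def list_images(directory, suffix):
    """Return the paths of the entries of `directory` whose name ends with `suffix`."""
    return [
        os.path.join(directory, name)          # full path, ready to be opened later
        for name in sorted(os.listdir(directory))
        if name.endswith(suffix)
    ]
```

For instance, list_images("photos", ".jpg") would return the ".jpg" entries of a hypothetical "photos" directory.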

For the more advanced:

  • also check that each name corresponds to a file (and not a directory)
  • make the pattern a string that represents a regex (e.g. ".jpeg|.jpg|.JPEG|.JPG").
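A possible sketch of the advanced variant, using re and os.path.isfile. The name list_images_regex is an assumption. Note that in the example pattern above the dot matches any character and nothing anchors the match at the end of the name; a stricter pattern such as r"\.(jpe?g|JPE?G)$" is probably closer to the intent.

```python
import os
import re


def list_images_regex(directory, pattern):
    """Return the paths of the regular files of `directory` whose name matches `pattern`."""
    regex = re.compile(pattern)
    return [
        os.path.join(directory, name)
        for name in sorted(os.listdir(directory))
        # keep the name only if the regex matches AND the entry is a regular file
        if regex.search(name) and os.path.isfile(os.path.join(directory, name))
    ]
```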

Load one image

  • load one image into a numpy array
  • get the shape of the image
  • check that all the images have the same shape


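  • check an image-reading function such as imageio.v3.imread or PIL.Image.open (scipy.misc.imread, the historical choice, was removed in SciPy 1.2)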
  • check what is arr.shape (where arr is a numpy array)
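A minimal sketch of these steps, assuming the Pillow library is installed for the actual file reading. The helper names load_image and same_shape are assumptions made for illustration.

```python
import numpy as np


def load_image(path):
    """Load an image file into a numpy array (assumes Pillow is installed)."""
    from PIL import Image  # imported lazily: only this function needs Pillow

    with Image.open(path) as img:
        return np.array(img)  # typically (height, width, 3) for a color image


def same_shape(arrays):
    """Return True if all the arrays share one shape (or if there are none)."""
    shapes = {arr.shape for arr in arrays}
    return len(shapes) <= 1
```

With this loader, arr.shape is usually (height, width, 3) for a color image and (height, width) for a grey scale one.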

Stack the images arrays in one big array

We are assuming here that all the images fit in memory.

Our problem now is to load all the images as a single numpy array. Fortunately, this exact question has been asked (and answered) on Stack Overflow:

  • read the post carefully
  • choose the solution that seems the most relevant to you
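  • write a function that takes as input a list of image names and returns a numpy array of dimension (nb_images, height, width, color_dim), where color_dim is 1 for a grey scale image and 3 for a color image. Note that numpy image arrays put the rows (height) first, and that a grey scale image is usually loaded as a 2-dimensional array, so the last axis has to be added explicitly.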
  • handle errors: raise an exception if any of the images cannot be loaded, or if the images do not all share the same shape.
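One way to write such a function is to preallocate the big array and fill it image by image. This is only a sketch: the name stack_images and the loader parameter (any function mapping an image name to an array) are assumptions, chosen so the reading library can be swapped freely.

```python
import numpy as np


def stack_images(paths, loader):
    """Stack the images named in `paths` into one array of shape (nb_images, height, width, color_dim)."""
    if not paths:
        raise ValueError("no image to load")
    stack = None
    for i, path in enumerate(paths):
        try:
            img = np.asarray(loader(path))
        except Exception as exc:
            raise OSError(f"cannot load image {path!r}") from exc
        if img.ndim == 2:
            # grey scale image: add a trailing axis so that color_dim is 1
            img = img[:, :, np.newaxis]
        if stack is None:
            # allocate the full array once, from the shape of the first image
            stack = np.empty((len(paths),) + img.shape, dtype=img.dtype)
        elif img.shape != stack.shape[1:]:
            raise ValueError(
                f"{path!r} has shape {img.shape}, expected {stack.shape[1:]}"
            )
        stack[i] = img
    return stack
```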

For the more advanced:

  • implement the different versions proposed in the link above and check which one is the fastest.
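A rough way to compare candidates with timeit. The linked post is not reproduced here, so the three variants below are simply common numpy ways of building the array, chosen as an assumption; random arrays stand in for the real images.

```python
import timeit

import numpy as np

# Synthetic stand-ins for the loaded images (20 small color images).
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(60, 80, 3), dtype=np.uint8) for _ in range(20)]


def with_np_array(imgs):
    return np.array(imgs)


def with_np_stack(imgs):
    return np.stack(imgs, axis=0)


def with_preallocation(imgs):
    out = np.empty((len(imgs),) + imgs[0].shape, dtype=imgs[0].dtype)
    for i, img in enumerate(imgs):
        out[i] = img
    return out


for func in (with_np_array, with_np_stack, with_preallocation):
    seconds = timeit.timeit(lambda: func(images), number=50)
    print(f"{func.__name__}: {seconds:.4f} s for 50 runs")
```

Timings depend on the machine and on the image size, so it is worth rerunning the comparison on your own images.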

Compute the resulting image

  • check the np.median function
  • check with a small example how you will use it to compute the median on the right axis
  • call the median function on your numpy array and save it as an image


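  • check a function that writes an image, such as imageio.v3.imwrite or PIL.Image.Image.save (scipy.misc.imsave, the historical choice, was removed in SciPy 1.2)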
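A small example of the median along the image axis, followed by a hedged sketch of the saving step. The stack is made up for illustration; to_uint8 and save_image are hypothetical helper names, and save_image assumes Pillow is installed.

```python
import numpy as np

# Three tiny 2x2 grey "images"; the value 99 plays the role of a passer-by.
stack = np.array(
    [
        [[10, 10], [10, 10]],
        [[10, 99], [10, 10]],
        [[10, 10], [99, 10]],
    ],
    dtype=np.uint8,
)

# axis=0 is the image axis: one median per pixel, computed across the images.
median = np.median(stack, axis=0)
print(median.shape)  # (2, 2)
print(median)        # every entry is 10.0: the outliers are gone


def to_uint8(arr):
    """np.median returns floats; round and convert back to 8-bit pixel values."""
    return np.rint(arr).astype(np.uint8)


def save_image(arr, path):
    """Write an 8-bit array to `path` (assumes Pillow is installed)."""
    from PIL import Image  # imported lazily: only this function needs Pillow

    if arr.ndim == 3 and arr.shape[2] == 1:
        arr = arr[:, :, 0]  # drop the single grey channel before saving
    Image.fromarray(arr).save(path)
```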

For the advanced

Modify your code to obtain an "out of core" algorithm (i.e. one that does not load all the images in memory at once):

  • write a function that builds the median image restricted to a subset of the image lines (e.g. compute_median(img_list, line_start, line_end))
  • write a function that reconstructs the full median array out of the set of partial arrays.
  • save the image
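A possible sketch of the first two steps. Compared with the signature suggested above, compute_median takes an extra loader argument (an assumption, any function mapping an image name to an array), and median_by_bands and band_height are hypothetical names. Each image is still read in full once per band, but only the requested lines are kept in memory.

```python
import numpy as np


def compute_median(img_list, line_start, line_end, loader):
    """Median image restricted to the lines line_start (included) to line_end (excluded)."""
    band = np.stack(
        [np.asarray(loader(name))[line_start:line_end] for name in img_list]
    )
    return np.median(band, axis=0)


def median_by_bands(img_list, loader, band_height):
    """Rebuild the full median image from partial medians computed band by band."""
    height = np.asarray(loader(img_list[0])).shape[0]
    parts = [
        compute_median(img_list, start, min(start + band_height, height), loader)
        for start in range(0, height, band_height)
    ]
    # the bands are in top-to-bottom order, so gluing them along the line axis is enough
    return np.concatenate(parts, axis=0)
```

A smaller band_height lowers the peak memory use at the cost of reading the files more often.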

For the very advanced

  • implement a parallel version of the above algorithm (see e.g. the concurrent.futures module or the multiprocessing module)
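A minimal sketch with concurrent.futures, one task per band of lines. The names band_median, parallel_median, loader and band_height are assumptions. A thread pool is used here because it needs no special setup; whether it actually speeds things up depends on how much of the loading and median work releases the interpreter lock. A ProcessPoolExecutor is a drop-in alternative, provided the functions and the loader are defined at module level and the call sits under an `if __name__ == "__main__":` guard.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def band_median(img_list, line_start, line_end, loader):
    """Median image restricted to the lines line_start (included) to line_end (excluded)."""
    band = np.stack(
        [np.asarray(loader(name))[line_start:line_end] for name in img_list]
    )
    return np.median(band, axis=0)


def parallel_median(img_list, loader, band_height, max_workers=4):
    """Compute the band medians concurrently, then glue them back in order."""
    height = np.asarray(loader(img_list[0])).shape[0]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(
                band_median, img_list, start, min(start + band_height, height), loader
            )
            for start in range(0, height, band_height)
        ]
        # iterate in submission order so the bands stay sorted from top to bottom
        parts = [future.result() for future in futures]
    return np.concatenate(parts, axis=0)
```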