Convolution Exploration

Beginning

This is a look at how convolution and pooling work. We're going to create three filters and apply them to an image to see how each one changes it. Then we're going to apply max-pooling to the same image to see how that affects it.

Imports

Python

from argparse import Namespace
from functools import partial

PyPi

import cv2
import holoviews
import numpy
from scipy import misc

My Stuff

from graeae import EmbedHoloviews

Some Setup

The Plotting

holoviews.extension("bokeh")
Embed = partial(EmbedHoloviews,
               folder_path="../../files/posts/keras/convolution-exploration/")
Plot = Namespace(
    width=700,
    height=700,
)

Middle

The Ascent

This is an exploration of how convolutions work, using ascent, a grayscale image provided by scipy.

image = misc.ascent()
plot = holoviews.Image(image).opts(
    title="Ascent",
    height=Plot.height,
    width=Plot.width,
    tools=["hover"],
)
Embed(plot=plot, file_name="ascent")()

Figure Missing

Note that, as I mentioned before, this is a grayscale image - Holoviews artificially tints it.
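
Since we'll be clamping values to the 0 - 255 range later on, it's worth a quick look at the image's dtype and pixel extremes first.

print(f"dtype: {image.dtype}")
print(f"pixel range: {image.min()} - {image.max()}")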

Now we're going to make a copy of the image and get its dimensions.

transformed_image = numpy.copy(image)
rows, columns = transformed_image.shape
print(f"Rows: {rows}, Columns: {columns}")
Rows: 512, Columns: 512

Creating Filters

A First Filter

Our filter is going to be a 3 x 3 array. The original lesson said that it is an edge-detector, but I don't think that's the case. Note that its weights sum to zero, which matters because the Convolver defined below re-scales any filter whose weights don't.

filter_1 = numpy.array([[-1, -2, -1], 
                        [0, 0, 0], 
                        [1, 2, 1]])
print(filter_1.sum())
0
  • Applying the Filter

    To apply the filter we need to traverse the cells and multiply the filter by the values of the image that match the current location of the filter. Since the filter is a 3 x 3 array, its center sits one pixel in from its edges, so when we do the traversal we start and end one row and one column away from each edge of the image. After multiplying the filter by the section of the image that it overlays, we sum the products and clamp the result so it stays within the 0 - 255 range that's valid for images. Finally, the value our filter created is stored in the matching cell of our transformed_image.

    class Convolver:
        """Applies a convolution to an image
    
        Args:
         image: the source image to convolve
         image_filter: the filter to apply
         identifier: something to identify the filter
        """
        def __init__(self, image: numpy.ndarray, image_filter: numpy.ndarray,
                     identifier: str):
            self.image = image
            self.identifier = identifier
            self._image_filter = None
            self.image_filter = image_filter
            self._transformed_image = None
            self._rows = None
            self._columns = None
            return
    
        @property
        def image_filter(self) -> numpy.ndarray:
            """The filter to apply to the image"""
            return self._image_filter
    
        @image_filter.setter
        def image_filter(self, new_filter: numpy.ndarray) -> None:
            """Stores the filter normalized to zero or one"""
            if new_filter.sum() != 0:
                new_filter = new_filter/new_filter.sum()
                print(f"Filter sum: {new_filter.sum()}")
            self._image_filter = new_filter
            return
    
        @property
        def rows(self) -> int:
            """the number of rows in the image"""
            if self._rows is None:
                self._rows, self._columns = self.image.shape
            return self._rows
    
        @property
        def columns(self) -> int:
            """number of columns in the image"""
            if self._columns is None:
                self._rows, self._columns = self.image.shape
            return self._columns
    
        @property
        def transformed_image(self) -> numpy.ndarray:
            """The image to transform"""
            if self._transformed_image is None:
                self._transformed_image = self.image.copy()
                for row in range(1, self.rows - 1):
                    for column in range(1, self.columns - 1):
                        convolution = (
                            self._transformed_image[
                                row - 1: row + 2, 
                                column-1: column + 2] * self.image_filter).sum()
                        convolution = max(0, convolution)
                        convolution = min(255, convolution)
                        self._transformed_image[row, column] = convolution
            return self._transformed_image
    
        def plot(self) -> None:
            """Plots the transformed image
           """
            height = width = Plot.height - 200
            image_1 = holoviews.Image(self.transformed_image).opts(
                height=height,
                width=width,
            )
            image_2 = holoviews.Image(self.image).opts(
                height=height,
                width=width,
            )
            plot = (image_2 + image_1).opts(
                title=f"Ascent Transformed ({self.identifier})"
            )
            Embed(plot=plot, file_name=self.identifier, 
                  height_in_pixels=height + 100)()
    
  • Looking at the Convolution's Output

    convolver = Convolver(image, filter_1, "filter_1")
    convolver.plot()
    

    Figure Missing

    This looks like it might be a contrast filter.
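
    Since cv2 is already imported, one way to sanity-check the hand-rolled loop is to compare it against cv2.filter2D. cv2.filter2D computes a correlation rather than a true convolution, which matches what the Convolver does since neither flips the kernel, but it's picky about integer dtypes, so everything gets cast to float32 first. The Convolver also leaves a one-pixel border untouched, so only the interiors get compared; if the two implementations agree, the printed difference should come out to zero.

    # apply the same filter with OpenCV (a ddepth of -1 keeps the float32 depth)
    reference = cv2.filter2D(image.astype("float32"), -1,
                             filter_1.astype("float32"))
    # clamp to the same 0 - 255 range that the Convolver enforces
    reference = numpy.clip(reference, 0, 255)
    # the Convolver skips the one-pixel border, so compare interiors only
    difference = numpy.abs(
        convolver.transformed_image[1:-1, 1:-1] - reference[1:-1, 1:-1])
    print(f"Largest difference: {difference.max()}")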

Try Another Filter

filter_2 = numpy.array([
    [0, 1, 0], 
    [1, -4, 1], 
    [0, 1, 0]])
convolver = Convolver(image, filter_2, "filter_2")
convolver.plot()

Figure Missing

I'm not sure what that filter is. It seems to find the darkest parts of the image.

Filter 3

filter_3 = numpy.array([
    [-1, 0, 1], 
    [-2, 0, 2], 
    [-1, 0, 1]])
convolver = Convolver(image, filter_3, "filter_3")
convolver.plot()

Figure Missing

I'm not sure exactly what that's doing. Based on what the filter looks like I would guess that it's finding vertical lines.
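
One way to check that guess: filter_3 is just filter_1 transposed, so whatever the first filter picks out along one direction, this one should pick out along the other.

# filter_3 is the transpose of filter_1
print(numpy.array_equal(filter_3, filter_1.T))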

Pooling

Now we'll look at what happens when you apply a (2, 2) max-pooling to an image, which halves it in each dimension. The loop visits every other pixel and looks at a 2 x 2 block: the current pixel, the one to its right, the one below it, and the one diagonally below and to the right, keeping only the highest of those four values.

# the pooled image is half the size of the original in each dimension
output = numpy.zeros((rows//2, columns//2))
stride = 2
for row in range(0, rows, stride):
    for column in range(0, columns, stride):
        # keep the brightest of the four pixels in each 2 x 2 block
        pixel = image[row: row + 2, column: column + 2].max()
        output[row//2, column//2] = pixel

print(f"Original Shape: {image.shape}")
print(f"Pooled Shape: {output.shape}")
Original Shape: (512, 512)
Pooled Shape: (256, 256)
plot = holoviews.Image(output).opts(
    height=Plot.height,
    width=Plot.width,
    title="Ascent With Pooling",
)
Embed(plot=plot, file_name="pooling")()

Figure Missing

The thing to note here is that, even though the image is half the size in each dimension (a quarter of the pixels), you can still make out the features, although there is some loss of resolution.
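
As an aside, the same (2, 2) max-pooling can be done without the explicit loops by reshaping the image so that each 2 x 2 block gets its own pair of axes and then taking the maximum over those axes. This is a sketch that assumes the row and column counts are even, which they are here.

# split each axis into (half-size, 2) so every 2 x 2 block has its own axes
pooled = image.reshape(rows//2, 2, columns//2, 2).max(axis=(1, 3))
# this should match the loop-based pooling above
print(numpy.array_equal(pooled, output))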