Perceptrons

Cloistered Monkey

2018-10-20 17:43

Imports

From PyPi

from graphviz import Digraph
import matplotlib.pyplot as pyplot
import numpy
import pandas
import seaborn

Setup the Plotting

%matplotlib inline
seaborn.set(style="whitegrid")
FIGURE_SIZE = (14, 12)

What is a Perceptron?

A perceptron is a model based on the neuron that works as a linear classifier.

Our acceptance model from the previous post:

\[ 2x_1 + x_2 - 18 = 0 \]

Would be modeled by something like this.

graph = Digraph(comment="Perceptron", format="png")
graph.graph_attr["rankdir"] = "LR"
graph.node("a", "Test")
graph.node("b", "Grade")
graph.node("c", "-18")
graph.node("d", "Label")
graph.edge("a", "c", label="2")
graph.edge("b", "c", label="1")
graph.edge("c", "d")
graph.render("graphs/perceptron.dot")
graph

perceptron.dot.png

Where the weights on the edges of the graph are multiplied by the values from the input nodes and then added together with -18. The constant we add at the end is called a bias value, and an alternative way to notate it is to add an input node for it that always has a value of 1 for the input and -18 for the edge. This is equivalent to the previous graph but makes it a little more consistent.

graph = Digraph(comment="Perceptron 2", format="png")
graph.graph_attr["rankdir"] = "LR"
graph.node("a", "Test")
graph.node("b", "Grade")
graph.node("e", "Bias=1")
graph.node("c", "+")
graph.node("d", "Label")
graph.edge("a", "c", label="2")
graph.edge("b", "c", label="1")
graph.edge("e" , "c", label="-18")
graph.edge("c", "d")
graph.render("graphs/perceptron_2.dot")
graph

perceptron_2.dot.png

We can re-draw the graph to make it more explicit that there is a separate step-wise function to convert the score to a label, as well as use the more general notation of x and w.

graph = Digraph(comment="Perceptron 3", format="png")
graph.graph_attr["rankdir"] = "LR"
graph.node("a", "Test")
graph.node("b", "Grade")
graph.node("e", "Bias=1")
graph.node("c", "+")
graph.node("d", "Step Function")
graph.node("f", "Label")
graph.edge("a", "c", label="2")
graph.edge("b", "c", label="1")
graph.edge("e" , "c", label="-18")
graph.edge("c", "d")
graph.edge("d", "f")
graph.render("graphs/perceptron_3.dot")
graph

perceptron_3.dot.png

Why are these called neural networks?

The perceptron is modeled on a neuron in the brain which takes signal inputs and decides to fire (or not) based on these inputs.

Can perceptrons do logic?

The AND operator

Here's the truth table for the And operator.

Input 1	Input 2	Output
0	0	0
0	1	0
1	0	0
1	1	1

So how would you make a perceptron for this? Remember that the perceptron is a linear separator, so if you think if the two inputs as axes on the plane, you would fit a line that separates the outputs of 0 from the output of 1.

A Perceptron Class

class Perceptron:
    """Simple single perceptron

    Args:
     weight_x: weight for input x values
     weight_y: weight for input y values (x2)
     bias: bias scalar
    """
    def __init__(self, weight_x: float, weight_y: float, bias: float) -> None:
        self.weight_x = weight_x
        self.weight_y = weight_y
        self.bias = bias
        return

    def score(self, x: float, y: float) -> float:
        """calculate score for the inputs

       Args:
        x, y: inputs to the linear equation

       Returns:
        score: value representing which side of the line the point is
       """
        return self.weight_x * x + self.weight_y * y + self.bias

    def separator(self, x:float) -> float:
        """generates the values for the separation line

       Args: 
        x: the input value to generate the y-value for

       Returns:
        y: value for the plot given x
       """
        return -(self.weight_x * x + self.bias)/self.weight_y

    def update(self, weights: numpy.ndarray) -> None:
        """Updates the weights

       Args:
        weights: array of new weights (including bias)
       """
        self.weight_x = weights[0]
        self.weight_y = weights[1]
        self.bias = weights[2]
        return

    def __call__(self, x:float, y:float) -> int:
        """converts the score to a label

       This is the stepwise function

       Args:
        x, y: point values to check 

       Returns:
        label: 1 if right of the line, 0 otherwize
       """
        return int(self.score(x, y)>=0)

A Truth Table Printer

def truth_table(perceptron):
    binary = [0, 1]
    print("|Input 1|Input 2| Label|")
    print("|-+-+-|")
    for input_1 in binary:
        for input_2 in binary:
            output = perceptron(input_1, input_2)
            print(
                "|{}|{}|{}|".format(
                    input_1, input_2, output))
    return

perceptron_and = Perceptron(weight_x=1, weight_y=1, bias=-1.5)

So now here's the perceptron's truth table.

truth_table(perceptron_and)

Input 1	Input 2	Label
0	0	0
0	1	0
1	0	0
1	1	1

figure, axe = pyplot.subplots()
axe.set_xlim((-.1, 1.1))
axe.set_ylim((-.1, 1.1))
axe.plot([0,0,1], [0, 1, 0], "bo", label="Not AND")
axe.plot([0.4, 1.1], [perceptron_and.separator(0.4), perceptron_and.separator(1.1)], "k")
axe.plot([1], [1], "ro", label="AND")
axe.set_title("Logical AND")
legend = axe.legend()

Perceptron OR

A similar thing can be done for the OR operator.

Input 1	Input 2	Output
0	0	0
0	1	1
1	0	1
1	1	1

perceptron_or = Perceptron(weight_x=1, weight_y=1, bias=-0.5)

And once again I'll check that the perceptron can replicate the truth table.

truth_table(perceptron_or)

Input 1	Input 2	Label
0	0	0
0	1	1
1	0	1
1	1	1

figure, axe = pyplot.subplots()
axe.plot([0], [0], "bo", label="Not OR")
axe.set_xlim((-.1, 1.1))
axe.set_ylim((-.1, 1.1))
axe.plot([-0.1, 0.8], [perceptron_or.separator(-0.1),
                       perceptron_or.separator(0.8)], "k")
axe.plot([0, 1, 1], [1, 0, 1], "ro", label="OR")
axe.set_title("Logical OR")
legend = axe.legend()

If you look at the plot you can see that the separator has to move lower, so, somewhat unintuitively your intercept (bias) should be less negative.

Or you should give more weight to the inputs.

perceptron_or_2 = Perceptron(weight_x=2.5, weight_y=2, bias=-1.5)

And here's the table generated with the same bias as the AND perceptron but with heavier weights.

truth_table(perceptron_or_2)

Input 1	Input 2	Label
0	0	0
0	1	1
1	0	1
1	1	1

figure, axe = pyplot.subplots()
axe.plot([0], [0], "bo", label="Not OR")
axe.set_xlim((-.1, 1.1))
axe.set_ylim((-.1, 1.1))
axe.plot([-0.1, 0.8], [perceptron_or_2.separator(-0.1),
                       perceptron_or_2.separator(0.8)], "k")
axe.plot([0, 1, 1], [1, 0, 1], "ro", label="OR")
axe.set_title("Logical OR")
legend = axe.legend()

Seems to work okay.

NOT

The NOT operation only looks at one input. To re-use our perceptron we can set the weights so it ignores the first input and negates the second.

Here's the Truth Table for NOT.

X	NOT
0	1
1	0

So now we create a perceptron with an x-weight of 0.

perceptron_not = Perceptron(weight_x = 0, weight_y=-1, bias=0.5)

And see the output.

truth_table(perceptron_not)

Input 1	Input 2	Label
0	0	1
0	1	0
1	0	1
1	1	0

The table is overkill, since we only need to test two outputs, but it shows that even with the same inputs as the other perceptrons it can negate the second input.

figure, axe = pyplot.subplots()
axe.set_xlim([-.1, 1.1])
axe.set_ylim([-.1, 1.1])
axe.plot([0, 1], [0, 0], "ro", label="False")
axe.plot([0, 1], [1, 1], "bo", label="True")
axe.plot([-.1, 1.1], [perceptron_not.separator(-.1), 
                      perceptron_not.separator(1.1)], "k")
axe.set_title("Logical NOT")
legend = axe.legend()

What about the XOR?

The XOR operator only returns True if one or the other input is True, not if both are True.

Input 1	Input 2	XOR
0	0	0
0	1	1
1	0	1
1	1	0

figure, axe = pyplot.subplots()
axe.set_title("XOR")
axe.plot([0, 1], [1, 0], "ro", label="XOR")
axe.plot([0, 1], [0, 1], "bo", label="Not XOR")
legend = axe.legend()

If you look at the plot you can see that a single straight line won't separate the blue and the red dots. The solution turns out to add a layers of perceptrons to make it work.

graph = Digraph(comment="Multilayer Perceptron", format="png")
graph.graph_attr['rankdir'] = "LR"
graph.node("a", " ")
graph.node("b", " ")
graph.node("c", "A")
graph.node("d", "B")
graph.node("e", "C")
graph.node("f", "AND")
graph.node("g", "XOR")
graph.edges(["ac", "ad", 'bc', 'bd', 'ce', 'ef', 'df', 'fg'])
graph.render("graphs/multilayer_perceptron.dot")
graph

multilayer_perceptron.dot.png

A, B, and C are OR, NOT and AND perceptrons, the key is to figure out which is which. The trick is to notice that AND is True when both are true, so we want B to to be 1 everytime there is at least one True and C to negate the one case when they're both True. So B is an OR, C is NOT and A is an AND (because and is True only when they're both True and C negates it).

graph = Digraph(comment="Multilayer Perceptron", format="png")
graph.graph_attr['rankdir'] = "LR"
graph.node("a", " ")
graph.node("b", " ")
graph.node("c", "AND 1")
graph.node("d", "OR")
graph.node("e", "NOT")
graph.node("f", "AND 2")
graph.node("g", "XOR")
graph.edges(["ac", "ad", 'bc', 'bd', 'ce', 'ef', 'df', 'fg'])
graph.render("graphs/multilayer_perceptron_2.dot")
graph

multilayer_perceptron_2.dot.png

Input 1	Input 2	AND 1	NOT	OR	AND 2
0	0	0	1	0	0
0	1	0	1	1	1
1	0	0	1	1	1
1	1	1	0	1	0

It's not the clearest table, but AND 1 and OR both take the original inputs, then NOT negates AND 1 and the output of NOT and OR feed into AND 2 which puts out our exclusive or.

Here's what happens when we use the perceptrons we created earlier to generate the same table.

inputs = [[0, 0],
          [0, 1],
          [1, 0],
          [1, 1]]

print("| Input 1 | Input 2 | AND 1 | NOT | OR | XOR |")
print("|---------+---------+-------+-----+----+-------|")
row = "|" + "{}|" * 6
for (x, y) in inputs:
    and_1 = perceptron_and(x, y)
    nand = perceptron_not(0, and_1)
    or_1 = perceptron_or(x, y)
    xor = perceptron_and(nand, or_1)
    print(row.format(x, y, and_1, nand, or_1, xor))

Input 1	Input 2	AND 1	NOT	OR	XOR
0	0	0	1	0	0
0	1	0	1	1	1
1	0	0	1	1	1
1	1	1	0	1	0

Looks right.

Nand

that the combination or AND and NOT is a NAND operator so you could simplify the diagram a little back to just two layers.

graph = Digraph(comment="Two-Layer XOR Perceptron", format="png")
graph.graph_attr['rankdir'] = "LR"
graph.node("a", " ")
graph.node("b", " ")
graph.node("c", "NAND")
graph.node("d", "OR")
graph.node("f", "AND")
graph.node("g", "XOR")
graph.edges(["ac", "ad", 'bc', 'bd', "cf", 'df', 'fg'])
graph.render("graphs/two_layer_xor.dot")
graph

two_layer_xor.dot.png

That's a lot of work to get an XOR, how are we going to classify images like this?

This isn't about image classification, but we probably should note that you normally wouldn't try and figure out the parameters by hand, the perceptron can tune itself. The way it does this is by picking some initial random values and then it repeatedly tests how well it did and adjusts the weights.

How does it adjust the weights?

This is what's called "the Perceptron Trick". It basically forms a vector for each of the misclassified points \((x_1, x_2, 1)\) and subtracts it from the weights \((w_1, w_2, b)\) if the score was on the positive side and adds to it if the score was on the negative side. It then repeats this for each of the misclassified points. Since we have multiple points we don't want it to just make these huge jumps, so the misclassified points vector is multiplied by some fraction (called the learning rate) so that the changes are small. Once it has the new weights it then tests itself again and makes another adjustment.

Question

If the original line was \(3x_1 + 4x_2 - 10 = 0\), and the learning rate was set to 0.1, how many adjustments would you have to make to reach the point (1, 1)?

x, y = 1, 1
weights = numpy.array([3, 4, -10])
learning_rate = 0.1
adjustment = numpy.ones(3) * learning_rate
output = 0
adjustments = 0
perceptron = Perceptron(weights[0], weights[1], weights[2])
while True:
    output = perceptron.score(x, y)
    if output >= 0:
        break
    print(output)
    adjustments += 1
    direction = 1 if output < 0 else -1
    weights = weights + direction * adjustment
    perceptron.update(weights)
print("Final Output: {}".format(output))
print(print("Adjustments: {}".format(adjustments)))

-3
-2.700000000000001
-2.4000000000000012
-2.1000000000000014
-1.8000000000000025
-1.5000000000000036
-1.2000000000000028
-0.9000000000000039
-0.600000000000005
-0.30000000000000604
-7.105427357601002e-15
Final Output: 0.29999999999999183
Adjustments: 11
None

I came up with 11 adjustments but Udacity says 10… Close enough, I guess.

Table of Contents