Non-Linear Regions
What's this about?
The perceptron seems to work fairly well with our admissions problem, but that's because our data is separable with a straight line. What if we need a curved line? This should also work; the secret sauce is how we define our error function.
Continuous vs Discrete
It turns out that if your values are discrete (rather than continuous), you might have a very difficult time tuning the algorithm: a small weight change either leaves a discrete error untouched or makes it jump, so the learning rate keeps the weights vacillating between solutions (as the sketch below illustrates). For this to work, we need a continuous solution space.
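Here is a minimal sketch of that problem, using three made-up points and a squared error as an assumed stand-in for a continuous error function (neither is from our admissions data): nudging a weight leaves the discrete misclassification count flat, while the continuous error shifts and tells us which direction helps.

points = [(1.0, 1.0, 1), (2.0, -1.0, 0), (-1.0, 2.0, 1)]  # (x1, x2, label)

def step(score):
    return 1 if score >= 0 else 0

def discrete_error(w1, w2, b):
    # number of points the step function gets wrong
    return sum(step(w1 * x1 + w2 * x2 + b) != y for x1, x2, y in points)

def continuous_error(w1, w2, b):
    # squared distance between each score and its label
    return sum((w1 * x1 + w2 * x2 + b - y) ** 2 for x1, x2, y in points)

for w1 in (1.0, 1.01):
    print(w1, discrete_error(w1, 1.0, 0.0), continuous_error(w1, 1.0, 0.0))
# 1.0  -> 1 misclassified, continuous error 2.0
# 1.01 -> 1 misclassified (no signal), continuous error 2.0606 (a signal)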
Gradient Descent
Using descending a mountain as a metaphor, our goal is to look around and find the path that will take us the furthest down the mountain. In mathematical terms, this means searching the space adjacent to where we are and taking the step that gives us the greatest reduction in error, as in the sketch below.
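As a concrete (if assumed) toy example, here is one-dimensional gradient descent on \(f(w) = (w - 3)^2\), a stand-in for the error surface rather than our actual model: at each step we compute the slope where we stand and move against it.

def gradient(w):
    # derivative of the toy error f(w) = (w - 3)**2
    return 2 * (w - 3)

w = 0.0              # our starting spot on the "mountain"
learning_rate = 0.1
for _ in range(25):
    w -= learning_rate * gradient(w)  # step in the steepest downhill direction
print(w)  # about 2.99, closing in on the minimum at w = 3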
Discrete vs Continuous Again (The Sigmoid)
Our categorizations are discrete, but our model doesn't work for discrete values… how do we reconcile this? The trick is to keep the values continuous and then, instead of using the step function to classify the outcomes, use the sigmoid function. This converts the interpretation from a discrete 0 or 1 to a probability.
\[ \sigma(x) = \frac{1}{1 + e^{-x}} \]
Question
The score is defined as \(4x_1 + 5x_2 - 9\); which of the following points has a 50% probability of being blue or red?
Imports
Python standard library
from math import exp

def score(x):
    # the linear score 4*x1 + 5*x2 - 9 from the question above
    return 4 * x[0] + 5 * x[1] - 9

def sigmoid(x):
    # squash a score into a probability between 0 and 1
    return 1 / (1 + exp(-x))

inputs = [[1, 1], [2, 4], [5, -5], [-4, 5]]
for x in inputs:
    print("{}: {}".format(x, sigmoid(score(x))))
[1, 1]: 0.5
[2, 4]: 0.9999999943972036
[5, -5]: 8.315280276641321e-07
[-4, 5]: 0.5
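Both \([1, 1]\) and \([-4, 5]\) land exactly on the line \(4x_1 + 5x_2 - 9 = 0\), so their scores are zero and the sigmoid maps them to 0.5: a 50% chance of being either color.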