Calling a JSON API

Beginning

Imports

Python

from pprint import pprint
import json
import urllib.parse
import urllib.request

PyPi

from expects import (
    equal,
    expect,
    start_with,
)

Set Up

URL

API_KEY = 42
API_URL = "http://py4e-data.dr-chuck.net/json?"

Sample

SAMPLE_LOCATION = "South Federal University"
SAMPLE_PLACE_ID = "ChIJ9e_QQm0sDogRhUPatldEFxw"

Actual

ACTUAL_LOCATION = "Kazan Federal University"
ACTUAL_PLACE_ID_STARTS_WITH = "ChIJGf9"

Middle

def get_place_id(location: str) -> str:
    """Get the place ID for the location

    Args:
     location: place to look up

    Returns:
     the place ID for the location
    """
    parameters = {"address": location, "key": API_KEY}
    request = API_URL + urllib.parse.urlencode(parameters)

    with urllib.request.urlopen(request) as response:
        data = json.loads(response.read().decode())
        # check the status before indexing into the results
        expect(data["status"]).to(equal("OK"))
        results = data["results"][0]
        place_id = results["place_id"]
        print(f"Location: {location}")
        print(f"Place ID: {place_id}")
    return place_id
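
To see what the request looks like without touching the network, here is a minimal sketch of just the query-string step. It reuses the URL and the placeholder key from the set-up above; build_request_url is a hypothetical helper name that isn't part of the original code.

```python
from urllib.parse import urlencode

API_KEY = 42  # placeholder key from the set-up section
API_URL = "http://py4e-data.dr-chuck.net/json?"


def build_request_url(location: str) -> str:
    """Build the request URL for a location (hypothetical helper)."""
    parameters = {"address": location, "key": API_KEY}
    # urlencode escapes the values, so the spaces in the address become "+"
    return API_URL + urlencode(parameters)


url = build_request_url("South Federal University")
print(url)
```

Letting urlencode do the escaping means the location can contain spaces or punctuation without any hand-built string formatting.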

The Sample

place_id = get_place_id(SAMPLE_LOCATION)
expect(place_id).to(equal(SAMPLE_PLACE_ID))
Location: South Federal University
Place ID: ChIJ9e_QQm0sDogRhUPatldEFxw

The Actual

place_id = get_place_id(ACTUAL_LOCATION)
expect(place_id).to(start_with(ACTUAL_PLACE_ID_STARTS_WITH))
Location: Kazan Federal University
Place ID: ChIJGf9kMxGtXkERIzwzBzFo8kY

End

Extracting Data From JSON

Beginning

Imports

Python

import json
import urllib.request

PyPi

from expects import expect, equal

Set Up

The URLs

SAMPLE_URL = "http://py4e-data.dr-chuck.net/comments_42.json"
ACTUAL_URL = "http://py4e-data.dr-chuck.net/comments_260445.json"

The Expected

SAMPLE_EXPECTED = 2553
ACTUAL_EXPECTED_LAST_TWO_DIGITS = 94

Middle

We're going to pull a JSON blob and extract the "count" values and sum them. The data is structured with a single top-level key ("comments") which holds a list of dicts with "name" (name of the commenter) and "count" (the number of comments the commenter has made) values.

def count(url: str) -> int:
    """Totals the comment counts

    Args:
     url: source of the JSON

    Returns:
     the total of the comment counts
    """
    response = urllib.request.urlopen(url)
    data = json.loads(response.read())
    total = 0
    for index, commenter in enumerate(data["comments"]):
        total += int(commenter["count"])

    print(f"Comments: {index + 1}")
    print(f"Comments: {total: ,}")
    return total
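
The summing logic can also be checked offline. This is a small sketch that assumes a payload with the structure described above (a top-level "comments" list of dicts with "name" and "count"); the names and numbers are made up, and total_counts is a hypothetical stand-in for the part of count that doesn't need the network.

```python
import json

# Made-up payload with the same shape as the course data
SAMPLE_JSON = """
{"comments": [
    {"name": "Ada", "count": 97},
    {"name": "Grace", "count": "3"},
    {"name": "Alan", "count": 20}
]}
"""


def total_counts(text: str) -> int:
    """Sum the "count" values in a JSON string (hypothetical helper)."""
    data = json.loads(text)
    # int() accepts counts stored either as numbers or as numeric strings
    return sum(int(commenter["count"]) for commenter in data["comments"])


total = total_counts(SAMPLE_JSON)
print(f"Total: {total:,}")
```

Using sum over a generator also gives a sensible answer (zero) when the list of comments is empty.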

The Sample

total = count(SAMPLE_URL)
expect(total).to(equal(SAMPLE_EXPECTED))
Comments: 50
Comments:  2,553

The Assignment

total = count(ACTUAL_URL)
expect(int(str(total)[-2:])).to(equal(ACTUAL_EXPECTED_LAST_TWO_DIGITS))
Comments: 50
Comments:  2,594

End

Extracting Data From XML

Beginning

Imports

Python

from xml.etree import ElementTree
import urllib.request

Set Up

The URLs

SAMPLE_URL = "http://py4e-data.dr-chuck.net/comments_42.xml"
ACTUAL_URL = "http://py4e-data.dr-chuck.net/comments_260444.xml"

The Expected

SAMPLE_EXPECTED = 2553
ACTUAL_EXPECTED_LAST_TWO_DIGITS = 34

Middle

def get_counts(url: str) -> int:
    """Get the sum of the 'count' tags 

    Args:
     url: URL to get the XML from

    Returns:
     sum of the count tags payloads
    """
    response = urllib.request.urlopen(url)
    tree = ElementTree.fromstring(response.read())
    total = 0
    for index, count in enumerate(tree.findall(".//count")):
        total += int(count.text)

    print(f"Tags: {index + 1}")
    print(f"Sum of Counts: {total}")
    return total
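
Here is a sketch of the same ".//count" search run against an inline string instead of a URL. The element names other than count are assumptions made for the illustration, as are the values; sum_count_tags is a hypothetical helper.

```python
from xml.etree import ElementTree

# Made-up document; only the <count> tags matter to the search
SAMPLE_XML = """
<commentinfo>
  <comments>
    <comment><name>Ada</name><count>97</count></comment>
    <comment><name>Grace</name><count>3</count></comment>
  </comments>
</commentinfo>
""".strip()


def sum_count_tags(text: str) -> int:
    """Sum the text of every <count> tag (hypothetical helper)."""
    tree = ElementTree.fromstring(text)
    # ".//count" matches count elements at any depth below the root
    return sum(int(count.text) for count in tree.findall(".//count"))


total = sum_count_tags(SAMPLE_XML)
print(f"Sum of Counts: {total}")
```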

Sample

total = get_counts(SAMPLE_URL)
assert total == SAMPLE_EXPECTED
Tags: 50
Sum of Counts: 2553

The Assignment

total = get_counts(ACTUAL_URL)
assert int(str(total)[-2:]) == ACTUAL_EXPECTED_LAST_TWO_DIGITS
Tags: 50
Sum of Counts: 2634

End

Web Scraping Assignment 1

Beginning

The goal of this exercise is to find all the <span> tags on a page and sum the numbers they contain.

Imports

Python

import urllib.request

PyPi

from bs4 import BeautifulSoup
from expects import (
    equal,
    expect,
    be_true,
)
from requests_html import HTMLSession

Setup

The URLs

SAMPLE_URL = "http://py4e-data.dr-chuck.net/comments_42.html"
ACTUAL_URL = "http://py4e-data.dr-chuck.net/comments_260442.html"

The Expected

SAMPLE_EXPECTED = 2553
ACTUAL_EXPECTED_LAST_DIGIT = 5

Middle

The Sample

The Way I Would Do It

def using_requests(url: str) -> int:
    """get the span total

    Args:
     url: The URL for the page

    Returns:
     the total sum
    """
    session = HTMLSession()
    response = session.get(url)
    expect(response.ok).to(be_true)
    total = 0

    for count, span in enumerate(response.html.find("span")):
        total += int(span.text)

    print(f"Count: {count + 1}")
    print(f"Sum: {total}")
    return total
total = using_requests(SAMPLE_URL)
expect(total).to(equal(SAMPLE_EXPECTED))
Count: 50
Sum: 2553

The Assignment Way

For this kind of thing, using urllib isn't really much more work. I'm used to the older Python 2 version, which was (or maybe only seemed at the time) more complicated to use.

def using_urllib(url: str) -> int:
    """get the span total with urllib and beautiful soup

    Args:
     url: the URL for the page

    Returns:
     the total of the span contents
    """
    response = urllib.request.urlopen(url)
    soup = BeautifulSoup(response.read(), "html.parser")
    total = 0
    for count, span in enumerate(soup.find_all("span")):
        total += int(span.text)

    print(f"Count: {count + 1}")
    print(f"Sum: {total}")
    return total
total = using_urllib(SAMPLE_URL)
expect(total).to(equal(SAMPLE_EXPECTED))
Count: 50
Sum: 2553
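
For checking the span-summing logic without a network connection or any third-party packages, here is a sketch that swaps in the standard library's html.parser in place of BeautifulSoup. The HTML snippet and the SpanSummer class are made up for the illustration.

```python
from html.parser import HTMLParser


class SpanSummer(HTMLParser):
    """Adds up the integer text inside every <span> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.in_span = False
        self.count = 0
        self.total = 0

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.in_span = True
            self.count += 1

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_span = False

    def handle_data(self, data):
        # only non-blank text that sits inside a span is added
        if self.in_span and data.strip():
            self.total += int(data)


# Made-up page fragment
SAMPLE_HTML = """
<table>
<tr><td>Ada</td><td><span class="comments">97</span></td></tr>
<tr><td>Grace</td><td><span class="comments">3</span></td></tr>
</table>
"""

summer = SpanSummer()
summer.feed(SAMPLE_HTML)
print(f"Count: {summer.count}")
print(f"Sum: {summer.total}")
```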

The Assignment

Requests HTML

total = using_requests(ACTUAL_URL)
expect(int(str(total)[-1])).to(equal(ACTUAL_EXPECTED_LAST_DIGIT))
Count: 50
Sum: 2305

Urllib

total = using_urllib(ACTUAL_URL)
expect(int(str(total)[-1])).to(equal(ACTUAL_EXPECTED_LAST_DIGIT))
Count: 50
Sum: 2305

End

Although I normally use requests or requests-html, I must say that the urllib version with BeautifulSoup for this particular exercise wasn't much different.

Web Scraping Assignment 2

Beginning

The goal of this exercise is to crawl through a set of anchor links to get a particular name stored in the nth anchor tag. The assignment specifically says to use urllib, but the documentation for urllib.request recommends Requests, whose own documentation says that it's in maintenance mode while work is being done on Requests III. Anyway, I like using Requests-HTML, so I'll use that and urllib side by side.

Imports

Python

import re
import urllib.request

PyPi

from bs4 import BeautifulSoup
from requests_html import HTMLSession

Setup

The URL

BASE_URL = "http://py4e-data.dr-chuck.net/known_by_"
SAMPLE_URL = f"{BASE_URL}Fikret.html"
ASSIGNMENT_URL = f"{BASE_URL}Abdalroof.html"

Middle

The Sample Exercise

The Easy Way

session = HTMLSession()
response = session.get(SAMPLE_URL)
assert response.ok

expression = r"_(?P<name>[^_.]+)\.html"
expression = re.compile(expression)
print(f"Name: {expression.search(SAMPLE_URL).group('name')}")

for hop in range(4):
    links = response.html.find("a")

    link_element = links[2]
    print(f"Name: {link_element.text}")
    response = session.get(link_element.attrs["href"])

print(f"Final Answer: {link_element.text}")
Name: Fikret
Name: Montgomery
Name: Mhairade
Name: Butchi
Name: Anayah
Final Answer: Anayah

The Slightly Less Easy Way

response = urllib.request.urlopen(SAMPLE_URL)

print(f"Name: {expression.search(SAMPLE_URL).group('name')}")
for hop in range(4):
   soup = BeautifulSoup(response.read(), "html.parser")
   link_element = soup.find_all("a")[2] 
   print(f"Name: {link_element.text}")
   response = urllib.request.urlopen(link_element["href"])
print(f"\nFinal Answer: {link_element.text}")
Name: Fikret
Name: Montgomery
Name: Mhairade
Name: Butchi
Name: Anayah

Final Answer: Anayah

The Real One

The Easy Way

session = HTMLSession()
response = session.get(ASSIGNMENT_URL)
assert response.ok

expression = r"_(?P<name>[^_.]+)\.html"
expression = re.compile(expression)
print(f"Name: {expression.search(ASSIGNMENT_URL).group('name')}")

for hop in range(7):
    links = response.html.find("a")

    link_element = links[17]
    print(f"Name: {link_element.text}")
    response = session.get(link_element.attrs["href"])

print(f"Final Answer: {link_element.text}")
Name: Abdalroof
Name: Billi
Name: Jayse
Name: Amaarah
Name: Cesar
Name: Rosheen
Name: Mohamed
Name: Kiara
Final Answer: Kiara

The Assignment Way

HOPS = 7
FIND_AT_INDEX = 18 - 1
response = urllib.request.urlopen(ASSIGNMENT_URL)

print(f"Name: {expression.search(ASSIGNMENT_URL).group('name')}")
for hop in range(HOPS):
   soup = BeautifulSoup(response.read(), "html.parser")
   link_element = soup.find_all("a")[FIND_AT_INDEX] 
   print(f"Name: {link_element.text}")
   response = urllib.request.urlopen(link_element["href"])
print(f"\nFinal Answer: {link_element.text}")
Name: Abdalroof
Name: Billi
Name: Jayse
Name: Amaarah
Name: Cesar
Name: Rosheen
Name: Mohamed
Name: Kiara

Final Answer: Kiara
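
The hop-following logic used in both versions can be exercised offline. The sketch below is an illustration under stated assumptions: the three-page "site" in PAGES is made up, the link positions are 1-based (like the 18 - 1 above), and the standard library's html.parser stands in for BeautifulSoup so that nothing needs installing.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, text) pairs for every anchor tag, in page order."""

    def __init__(self) -> None:
        super().__init__()
        self.links = []
        self._href = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        # the first text after an opening <a> is taken as the link text
        if self._href is not None:
            self.links.append((self._href, data.strip()))
            self._href = None


def page(*names: str) -> str:
    """Build a made-up page that links to each of the names."""
    return "".join(f'<a href="known_by_{name}.html">{name}</a>' for name in names)


# Hypothetical site: every page lists the other two names
PAGES = {
    "known_by_Ann.html": page("Bob", "Cat"),
    "known_by_Bob.html": page("Cat", "Ann"),
    "known_by_Cat.html": page("Ann", "Bob"),
}


def follow(start: str, position: int, hops: int) -> str:
    """Follow the link at `position` (1-based) `hops` times; return the last name."""
    url, name = start, ""
    for _ in range(hops):
        collector = LinkCollector()
        collector.feed(PAGES[url])
        url, name = collector.links[position - 1]
    return name


answer = follow("known_by_Ann.html", position=2, hops=3)
print(f"Final Answer: {answer}")
```

Keeping the position and hop count as arguments makes it harder to mix up the sample settings with the assignment's.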

End

Nested follower

Beginning

This is assignment 1 from the Nature of Code course on Kadenze. I was originally going to make it a mouse-follower but I re-read the instructions and it seems like it's better to make it a random-walker. These are the requirements:

Create A Walker

  • Create an object that moves around the screen
  • Incorporate randomness or Perlin noise

Specifications

  • Needs to be visually different from the Nature of Code examples
  • Use comments
  • Only use p5.js libraries

Middle

/**
 * Random Walker
 *
 * This is an implementation of the Random Walker based on the example given in
 * "The Nature of Code"
 */

// This is the div where the canvas will be placed
let nested_parent_div_id = "nested-follower";

/**
 * The sketch creator
 * 
 * @param {P5} p
 */
let nested_follower_sketch = function(p) {
    /**
     * Setup the canvas
     *
     * - Attaches the canvas to the div
     * - Creates the walker objects
     */
    p.setup = function() {
        this.canvas = p.createCanvas($("#" + nested_parent_div_id).outerWidth(true), 800);
        p.parent = new NoiseWalker(p);
        let followers = 10;
        p.followers = [new Follower(p, p.parent)];
        for (let step=0; step < followers; step++) {
            p.followers.push(new Follower(p, p.followers[step]));
        };
    };

    /**
     * Refresh the objects by calling their update functions
     *
     * This also clears the background.
     */
    p.draw = function() {
        p.background(255, 255, 255, 50);
        p.parent.update();
        p.followers.forEach(function(follower) {
            follower.update();
        });
    };
};

/**
 * The main walker (with perlin noise)
 *
 * @param {P5} p
 */
function NoiseWalker(p) {
    // Our kinematics vectors
    this.position = p.createVector(p.width/2, p.height/2);
    this.velocity = p.createVector(0, 0);
    this.acceleration = p.createVector(0, 0);

    // Time for the perlin noise function
    this.time_x = 0;
    this.time_y = 10000;
    this.time_delta = 0.01;

    // The limit of how much we'll accelerate
    this.max_acceleration = 0.05;

    /**
     * Updates the walker's position
     */
    this.walk = function() {
        // set the acceleration using perlin noise
        this.acceleration.x = p.map(p.noise(this.time_x), 0, 1,
                                    -this.max_acceleration,
                                    this.max_acceleration);
        this.acceleration.y = p.map(p.noise(this.time_y), 0, 1,
                                    -this.max_acceleration,
                                    this.max_acceleration);
        // Move the walker
        this.velocity = this.velocity.add(this.acceleration);
        this.position = this.position.add(this.velocity);

        // keep it within the window
        // (point the velocity back inside whichever edge was crossed)
        if (this.position.x < 0)
            this.velocity.x = p.abs(this.velocity.x);
        else if (this.position.x > p.width)
            this.velocity.x = -p.abs(this.velocity.x);
        if (this.position.y < 0)
            this.velocity.y = p.abs(this.velocity.y);
        else if (this.position.y > p.height)
            this.velocity.y = -p.abs(this.velocity.y);

        // update the time
        this.time_x += this.time_delta;
        this.time_y += this.time_delta;
    };

    /**
     * draws the walker (for debugging)
     */
    this.display = function() {
        p.stroke(0);
        p.point(this.position.x, this.position.y);
    };

    /**
     * Calls the walk function
     */
    this.update = function() {
        this.walk();
    };
}


/**
 * A follower that follows a parent object
 *
 * @param {P5} p
 * @param {NoiseWalker} parent
 */
function Follower(p, parent) {
    this.parent = parent;
    this.variance = p.random(1, 10);
    this.position = p.createVector(
        this.parent.position.x + p.random(-this.variance, this.variance),
        this.parent.position.y + p.random(-this.variance, this.variance));
    this.velocity = p.createVector(0, 0);

    // the largest diameter to draw and the starting times for the Perlin noise
    this.max_diameter = p.round(p.random(5, 200));
    this.time_x = p.random(100);
    this.time_y = p.random(10000, 11000);
    this.time_delta = 0.005;

    /**
     * Moves the Follower
     *
     * sets the acceleration by pointing to the parent's position
     */
    this.walk = function() {
        let acceleration = p5.Vector.sub(this.parent.position, this.position);       
        this.velocity = this.velocity.add(acceleration);
        this.position = this.position.add(this.velocity);
    };

    /**
     * Display the Follower
     *
     * draws a fixed-color outline whose stroke weight varies with Perlin noise
     */
    this.display = function() {
        // set our line width
        p.strokeWeight(p.map(p.noise(this.time_x, this.time_y),
                             0, 1, this.variance, 2 * this.variance));

        // set our color
        p.stroke(63, 63, 191);        

        // don't fill the object
        p.noFill();

        // draw our object with a random diameter
        p.ellipse(this.position.x, this.position.y,
                  p.round(p.random(5, this.max_diameter)),
                  p.round(p.random(5, this.max_diameter)));

        // update the time
        this.time_x += this.time_delta;
        this.time_y += this.time_delta;

    };

    /**
     * calls walk and then display
     */
    this.update = function() {
        this.walk();
        this.display();
    };
}

// create the p5 object and attach it to the div
new p5(nested_follower_sketch, nested_parent_div_id);
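
One consequence of the Follower's walk rule (the acceleration is the full offset to the parent, with no damping or limit) is easier to see outside of p5.js. The Python sketch below is not a port of the code above; it restates the rule in one dimension and assumes a parent that stands still. Under those assumptions the follower never settles on the parent and instead repeats its motion every six steps.

```python
from typing import Tuple


def follower_step(position: int, velocity: int, parent: int) -> Tuple[int, int]:
    """One update of the Follower rule in one dimension."""
    acceleration = parent - position      # point straight at the parent
    velocity = velocity + acceleration    # no damping and no magnitude limit
    position = position + velocity
    return position, velocity


PARENT = 100          # stationary parent (an assumption for this sketch)
state = (107, 0)      # the follower starts 7 units away, at rest
trajectory = [state]
for _ in range(6):
    state = follower_step(*state, PARENT)
    trajectory.append(state)

positions = [position for position, _ in trajectory]
print(positions)
```

In the sketch proper the parent keeps moving, so each follower keeps overshooting a moving target, which fits the jittery look of the chain.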

End

A Mouse Follower

Beginning

Instead of a random walker this walker will be attracted (somewhat) to the mouse cursor.

Middle

let parent_div_id = "mouse-follower";

let mouse_follower_sketch = function(p) {
    p.setup = function() {
        this.canvas = p.createCanvas($("#" + parent_div_id).outerWidth(true), 800);
        p.walker = new MouseWalker(p);
    }

    p.draw = function() {
        p.background(255);
        p.walker.walk();
        p.walker.display();
    }
};

function MouseWalker(p) {
    this.position = p.createVector(p.width/2, p.height/2);
    this.velocity = p.createVector(0, 0);

    this.walk = function() {
        let mouse = p.createVector(p.mouseX, p.mouseY);
        // Calling sub on a vector updates it in place, which is fine here
        // because `mouse` is a throwaway vector created for this frame.
        // The static p5.Vector.sub (called on the module, p5, not the
        // instance, p) would return a new vector instead.
        let acceleration = mouse.sub(this.position);

        // setMag fixes the magnitude at 0.1 and leaves the direction unchanged
        acceleration.setMag(0.1);
        this.velocity = this.velocity.add(acceleration);
        this.position = this.position.add(this.velocity);
    };

    this.display = function() {
        p.stroke(0);
        p.noFill();
        p.background(255, 255, 255, 25);
        p.ellipse(this.position.x, this.position.y, 48, 48);
    };
}

sketch_container = new p5(mouse_follower_sketch, parent_div_id);
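
The effect of setting the magnitude is that the pull toward the mouse is equally strong however far away the cursor is. Here is a plain-Python sketch of that idea; set_magnitude is a hypothetical stand-in written for this illustration rather than p5.js code, and returning a zero vector unchanged is a choice made for the sketch.

```python
import math
from typing import Tuple


def set_magnitude(x: float, y: float, magnitude: float) -> Tuple[float, float]:
    """Scale (x, y) to the given length, keeping its direction."""
    length = math.hypot(x, y)
    if length == 0:
        return (0.0, 0.0)   # a zero vector has no direction to keep
    scale = magnitude / length
    return (x * scale, y * scale)


near = set_magnitude(3, 4, 0.1)        # cursor 5 pixels away
far = set_magnitude(300, 400, 0.1)     # cursor 500 pixels away, same direction
print(near, far)
```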

End

  1. Shiffman D. The nature of code: simulating natural systems with processing. Version 1.0, generated December 6, 2012. s.l.: Selbstverl.; 2012. 498 p.

A Random Accelerator

Beginning

This is an extension of the random walker with acceleration added.

Middle

let random_accelerator_sketch = function(p) {
    p.setup = function() {
        let parent_div_id = "random-accelerator";
        this.canvas = p.createCanvas($("#" + parent_div_id).outerWidth(true), 800);
        this.canvas.parent(parent_div_id);
        p.walker = new Walker(p);
    }

    p.draw = function() {
        p.background(255);
        p.walker.walk();
        p.walker.display();
    }
};

function Walker(p) {
    this.position = p.createVector(p.width/2, p.height/2);
    this.velocity = p.createVector(0, 0);

    this.walk = function() {
        // a random nudge in each direction, scaled down to keep the motion gentle
        let acceleration = p.createVector(p.random(-1, 1), p.random(-1, 1));
        acceleration = acceleration.mult(0.1);
        this.velocity = this.velocity.add(acceleration);
        this.position = this.position.add(this.velocity);
    };

    this.display = function() {
        p.stroke(0);
        p.noFill();
        p.background(255, 255, 255, 25);
        p.ellipse(this.position.x, this.position.y, 48, 48);
    };
}

sketch_container = new p5(random_accelerator_sketch, 'random-accelerator');

End

This was a very rudimentary walker. The main point is that we now have the basic kinematic elements (position, velocity, and acceleration) to make something that follows the rules of classical physics (more or less).
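
The chain from acceleration to velocity to position can be written out without p5.js. This is a rough Python restatement rather than the sketch itself: the tuples stand in for vectors, the starting point and step count are arbitrary, and the fixed seed is only there to make the run repeatable.

```python
import random

SCALE = 0.1   # the same scaling factor the walker applies to its acceleration


def accelerate(position, velocity, rng):
    """One step: random acceleration, then velocity, then position."""
    acceleration = (rng.uniform(-1, 1) * SCALE, rng.uniform(-1, 1) * SCALE)
    velocity = (velocity[0] + acceleration[0], velocity[1] + acceleration[1])
    position = (position[0] + velocity[0], position[1] + velocity[1])
    return position, velocity


rng = random.Random(0)   # fixed seed so the run is repeatable
position, velocity = (400.0, 400.0), (0.0, 0.0)
STEPS = 100
for _ in range(STEPS):
    position, velocity = accelerate(position, velocity, rng)

print(position, velocity)
```

Because the velocity accumulates every nudge, its size after n steps can be anywhere up to n times the scale, which is why the walker can build up speed instead of just jittering in place.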

  1. Shiffman D. The nature of code: simulating natural systems with processing. Version 1.0, generated December 6, 2012. s.l.: Selbstverl.; 2012. 498 p.

A Random Walk(er)

Beginning

This is another post to see if I understand how to get p5.js working in Nikola. It's been a while since I tried, and I just want to see if I remember how. This uses the random walk example from Daniel Shiffman's book The Nature of Code.

Middle

A Div to Locate the Sketch

The id of this div is set in the p5.js setup function as the parent of the sketch.

<script language="javascript" type="text/javascript" src="walker.js"></script>
<div id="random-walk-container">
</div>

Note: Originally this wasn't working, because I had the line to include the javascript inside the div to hold the canvas. Make sure that div is always empty.

The Javascript

let sketch = function(p) {
    p.setup = function() {
        let parent_div_id = "random-walk-container";
        this.canvas = p.createCanvas($("#" + parent_div_id).outerWidth(true), 300);
        this.canvas.parent(parent_div_id);
        p.walker = new Walker(p);
    }

    p.draw = function() {
        p.background(255);
        p.walker.walk();
        p.walker.display();
    }
};

function Walker(p) {
  this.x = p.width/2;
  this.y = p.height/2;

  this.walk = function() {
    this.x = this.x + p.random(-1, 1) * 10;
    this.y = this.y + p.random(-1, 1) * 10;
  }

  this.display = function() {
    p.fill(0);
    p.ellipse(this.x, this.y, 48, 48);
  }
}

//let node = document.getElementById("random-walk")
//window.document.getElementsByTagName("body")[0].appendChild(node);
sketch_container = new p5(sketch, 'random-walk-container');

End

As always, this was way harder than it should have been.

The Origin of Bayes' Theorem

I'm reading "The theory that would not die" and these are notes I took from it. The book didn't really give me a clear idea of what Price's argument was, so I also read a Quartz article about that part of the story and, of course, Wikipedia came into it at some points.

A Brief Sketch of The Timelines to Bayes' Theorem

The Equations

Since it's hard to write out the equations in bullet points I'm going to write some simple versions here.

Bayes' Formulation

"The theory that would not die" notes that Bayes didn't write out an equation, but it can be written out something like this. \[ P(\textit{cause}|\textit{effect}) = \frac{P(\textit{effect}|\textit{cause}) P(\textit{cause})}{P(\textit{effect})} \]

Laplace's First Version

Originally Laplace didn't have the priors in his equation (I'll substitute C for cause, E for effect and C' for not our theorized cause). \[ P(C|E) = \frac{P(E|C)}{\sum P(E|C')} \]

Laplace's Final Version

\[ P(C|E) = \frac{P(E|C)P_{\textit{prior}}(C)}{\sum P(E|C') P_{\textit{prior}} (C')} \]
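
To make the difference between the versions concrete, here is a small numeric sketch. It rests on an assumption that goes beyond the notes above: the sum in the denominator is taken over every candidate cause, including the one being tested, so that the posteriors add up to one. The causes and all of the numbers are made up, and posteriors is a hypothetical helper.

```python
from fractions import Fraction


def posteriors(likelihoods, priors=None):
    """Return P(C|E) for every candidate cause C.

    likelihoods: maps each cause to P(E|C)
    priors: maps each cause to its prior; None gives the first version
    """
    if priors is None:
        priors = {cause: Fraction(1) for cause in likelihoods}
    weights = {cause: likelihoods[cause] * priors[cause] for cause in likelihoods}
    # ASSUMPTION: the sum runs over every candidate cause, including C itself
    total = sum(weights.values())
    return {cause: weight / total for cause, weight in weights.items()}


# Made-up likelihoods P(E|C) for three candidate causes
likelihoods = {"A": Fraction(1, 2), "B": Fraction(1, 4), "C": Fraction(1, 4)}

first = posteriors(likelihoods)
uniform = posteriors(likelihoods, {cause: Fraction(1, 3) for cause in likelihoods})
skewed = posteriors(
    likelihoods,
    {"A": Fraction(1, 10), "B": Fraction(8, 10), "C": Fraction(1, 10)},
)

# With two causes the final version is Bayes' formulation with P(effect) expanded
two_causes = posteriors(
    {"cause": Fraction(9, 10), "other": Fraction(1, 10)},
    {"cause": Fraction(1, 100), "other": Fraction(99, 100)},
)

print(first, skewed, two_causes["cause"], sep="\n")
```

Under that assumption the first version gives the same answers as the final version with equal priors, and unequal priors can overturn the ranking that the likelihoods alone suggest.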

Sources

  1. McGrayne SB. The theory that would not die: how Bayes’ rule cracked the enigma code, hunted down Russian submarines, and emerged triumphant from two centuries of controversy. paperback ed. New Haven, Conn.: Yale University Press; 2011. 336 p.
  2. Kopf D. The most important formula in data science was first used to prove the existence of God [Internet]. Quartz. [cited 2019 May 12]. Available from: https://qz.com/1315731/the-most-important-formula-in-data-science-was-first-used-to-prove-the-existence-of-god/