Setting up pytest

Cloistered Monkey

2023-07-02 15:28

What is this about?

This is a post to generate and document the pytest.ini configuration for testing.

Requirements

The Neurotic Repository

To make the neurotic code importable you need to install the package (in development mode, since it's (hopefully) always changing) using pip.

pip install --editable .

The Testing Dependencies

There are some things that need to be installed to get the code to run (like numpy and nltk) but for the testing this is what's needed.

expects
faker
pytest
pytest-bdd
pytest-mock
pytest-xdist

This list is in a text-file (tests/testing-requirements) so the packages can be installed by changing into the tests folder and running pip.

pip install --requirement testing-requirements

The Configuration File

Pytest can read in an INI formatted file. It has a bunch of configuration options but I haven't really explored them much. I mostly use it to set the command line arguments that I would otherwise pass in.

The Header

This is just the pytest section header so pytest knows to look here.

[pytest]

The Pytest BDD Features Base

Pytest-bdd will look for a feature file in the same folder as the test-file, unless you tell it not to, so let's tell it to look in the sub-folder named "features" instead.

bdd_features_base_dir = features/

The Regular Testing Run

addopts = --exitfirst --failed-first --color yes --gherkin-terminal-reporter
          --looponfail --numprocesses auto

I didn't realize it before, but as this stackoverflow answer mentions, you can break up long lines in INI files (assuming that the code is using python's ConfigParser) by indenting the continuation lines (although it appears, unfortunately, to break pygments' syntax highlighting).
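As a quick check of the continuation-line behavior, here's a minimal sketch using python's configparser. The INI text is a shortened copy of the addopts setting above, and this only illustrates the ConfigParser case that the answer describes; whatever pytest uses internally to read the file may behave differently.

```python
from configparser import ConfigParser

# a shortened copy of the addopts setting, with an indented continuation line
INI = """
[pytest]
addopts = --exitfirst --failed-first
          --looponfail --numprocesses auto
"""

parser = ConfigParser()
parser.read_string(INI)

# the continuation line is joined to the first line with a newline
addopts = parser["pytest"]["addopts"]

# splitting on whitespace gives back the individual arguments
options = addopts.split()
print(options)
```

The indented second line ends up as part of the same value, so the arguments come back as one list.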

Here's a breakdown of the options.

exitfirst: stop testing after the first failed test
failed-first: when re-starting the tests, start with the test that failed
color yes: Add some highlight-coloring to the pytest output
gherkin-terminal-reporter: Format the output using pytest-bdd
looponfail: re-run the testing if a file changes
numprocesses auto: Run the tests in parallel across all available cores.

The numprocesses option can alternatively take logical as an argument, meaning use all logical CPUs, not physical cores (this requires you install pytest-xdist[psutil]) or the actual number of processes you want to run in parallel.

The Run-Once

This removes the automatic re-running of tests. I don't really use this, but sometimes it can be helpful, especially when using something like Selenium that slows everything down a lot, to not re-run the test every time you save a file.

# addopts = --exitfirst --failed-first --color yes --gherkin-terminal-reporter --numprocesses auto

The PUDB Run

pytest will grab standard out by default, making it impossible to run an interactive debugger. They have support for python's pdb debugger, but I use pudb instead, so this set of arguments turns off the capturing of standard out which will let you run the tests with PUDB. There is a project that integrates pudb into pytest, but it appears to have died out, so I'll just stick to my old way of doing it.

# addopts = --exitfirst --failed-first --color yes --gherkin-terminal-reporter --capture=no

DuckDuckGo Image Search

Cloistered Monkey

2022-11-21 15:19

DuckDuckGo Image Search

This is a post for some notes on the ddg_images function from the duckduckgo-search library (link to GitHub) which downloads images using duckduckgo (and thus bing in a way).

The Parameters

Let's start by looking at the arguments that the function takes.

# python
from pprint import pprint

# pypi
from duckduckgo_search import ddg_images

print(ddg_images.__doc__)

DuckDuckGo images search. Query params: https://duckduckgo.com/params

    Args:
        keywords (str): keywords for query.
        region (str, optional): wt-wt, us-en, uk-en, ru-ru, etc. Defaults to "wt-wt".
        safesearch (str, optional): On, Moderate, Off. Defaults to "Moderate".
        time (Optional[str], optional): Day, Week, Month, Year. Defaults to None.
        size (Optional[str], optional): Small, Medium, Large, Wallpaper. Defaults to None.
        color (Optional[str], optional): color, Monochrome, Red, Orange, Yellow, Green, Blue,
            Purple, Pink, Brown, Black, Gray, Teal, White. Defaults to None.
        type_image (Optional[str], optional): photo, clipart, gif, transparent, line.
            Defaults to None.
        layout (Optional[str], optional): Square, Tall, Wide. Defaults to None.
        license_image (Optional[str], optional): any (All Creative Commons), Public (PublicDomain),
            Share (Free to Share and Use), ShareCommercially (Free to Share and Use Commercially),
            Modify (Free to Modify, Share, and Use), ModifyCommercially (Free to Modify, Share, and
            Use Commercially). Defaults to None.
        max_results (int, optional): maximum number of results, max=1000. Defaults to 100.
        output (Optional[str], optional): csv, json, print. Defaults to None.
        download (bool, optional): if True, download and save images to 'keywords' folder.
            Defaults to False.

    Returns:
        Optional[List[dict]]: DuckDuckGo text search results.

Hopefully the arguments are pretty straightforward; I couldn't find an official help page for image searches that explains them any further.

The Output

Let's do a search for images of lop-eared rabbits and then take a look at what ddg_images returns.

output = ddg_images("rabbit lop", type_image="photo",
                    license_image="public", 
                    max_results=1)

pprint(output[0])

{'height': 848,
 'image': 'https://cdn.pixabay.com/photo/2015/12/22/20/27/animals-1104748_1280.jpg',
 'source': 'Bing',
 'thumbnail': 'https://tse1.mm.bing.net/th?id=OIP.lEqFD_LPmRGbc1GyIUoNygHaE6&pid=Api',
 'title': 'Flemish Lop Rabbit Very · Free photo on Pixabay',
 'url': 'https://pixabay.com/en/flemish-lop-rabbit-rabbit-1104748/',
 'width': 1280}

I thought that the output argument would change the format of the returned values but it instead seems to direct the values to a file (counting stdout as a file) and then return the same function output as before. Here's what "print" does.

output = ddg_images("rabbit lop", type_image="photo",
                    license_image="public",
                    output="print",
                    max_results=1)

print(output[0])

1. {
    "title": "Flemish Lop Rabbit Very · Free photo on Pixabay",
    "image": "https://cdn.pixabay.com/photo/2015/12/22/20/27/animals-1104748_1280.jpg",
    "thumbnail": "https://tse1.mm.bing.net/th?id=OIP.lEqFD_LPmRGbc1GyIUoNygHaE6&pid=Api",
    "url": "https://pixabay.com/en/flemish-lop-rabbit-rabbit-1104748/",
    "height": 848,
    "width": 1280,
    "source": "Bing"
}
{'title': 'Flemish Lop Rabbit Very · Free photo on Pixabay', 'image': 'https://cdn.pixabay.com/photo/2015/12/22/20/27/animals-1104748_1280.jpg', 'thumbnail': 'https://tse1.mm.bing.net/th?id=OIP.lEqFD_LPmRGbc1GyIUoNygHaE6&pid=Api', 'url': 'https://pixabay.com/en/flemish-lop-rabbit-rabbit-1104748/', 'height': 848, 'width': 1280, 'source': 'Bing'}

Under The Hood

I was hoping that I'd be able to link to some official documentation to explain what's going on, but from what I can tell duckduckgo doesn't have an advertised API for its image search, and duckduckgo-search isn't too heavily documented. If you look at the duckduckgo-search code, though, it appears to use a special token (vqd) to make queries to a special endpoint (https://www.duckduckgo.com/i.js) that returns the search results. I couldn't find this documented in the duckduckgo-search repository but it is mentioned in this StackOverflow answer. I don't know how they figured it out, but using the vqd parameter and o=json makes it work pretty much like an API, although the code is also handling pagination, so it seems almost like a hybrid of web-scraping and API requests.

The VQD

I'll show a request for "rabbit lop" using URL parameters. The actual code is using the requests library and sends them as a payload dictionary instead, but I thought it might be more familiar to see them as URL parameters (and it makes it so you can copy and paste the output into a browser to see what happens).

The first thing that ddg_images does is make a POST request to "https://duckduckgo.com/" using the search terms as data ({"q": "rabbit lop"}).

This returns some HTML, within which is some javascript that contains the "vqd" value that we need to pass in as an argument to make the proper search query.

<script type="text/JavaScript">
  function nrji() {
    nrj('/t.js?q=rabbit%20lop&l=us-en&s=0&dl=en&ct=US&ss_mkt=us&p_ent=&ex=-1&dfrsp=1')
    DDG.deep.initialize('/d.js?q=rabbit%20lop&l=us-en&s=0&dl=en&ct=US&ss_mkt=us&vqd=3-175223187338608511244788076450682226312-294342034741290420994864096532891255767&p_ent=&ex=-1&sp=1&dfrsp=1');;
  }
  DDG.ready(nrji, 1)
</script>

The "vqd" is one of the arguments to the DDG.deep.initialize function in the javascript/HTML that's returned from the request and the duckduckgo-search code extracts it using substring searching, giving us just the vqd.

vqd=3-175223187338608511244788076450682226312-294342034741290420994864096532891255767

Then a second request is made using the VQD token as one of the parameters along with whatever other parameters you want - in this case we're setting the region (l=wt-wt meaning "no region"), asking that the response be JSON (o=json), using the keywords "rabbit lop" (q="rabbit lop"), and turning safe-search off (p=-1). To make it work you also need to send the request to a special endpoint https://duckduckgo.com/i.js. So we end up with a request that looks like this:

https://duckduckgo.com/i.js?q=rabbit+lop&o=json&l=wt-wt&s=0&f=%2C%2C%2Ctype%3Aphoto%2C%2Clicense%3Apublic&p=-1&vqd=3-175223187338608511244788076450682226312-294342034741290420994864096532891255767
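The same URL can be built from a dictionary of parameters using the standard library, which is more or less what sending them as a payload amounts to. This is only a sketch, not the duckduckgo-search code, and the variable names are my own.

```python
from urllib.parse import urlencode

# the token pulled out of the HTML in the previous step
VQD = ("3-175223187338608511244788076450682226312"
       "-294342034741290420994864096532891255767")

parameters = {
    "q": "rabbit lop",                      # the search keywords
    "o": "json",                            # ask for a JSON response
    "l": "wt-wt",                           # the region ("no region")
    "s": 0,
    "f": ",,,type:photo,,license:public",   # the image filters
    "p": -1,                                # safe-search off
    "vqd": VQD,
}

# urlencode escapes the space, commas and colons for us
url = "https://duckduckgo.com/i.js?" + urlencode(parameters)
print(url)
```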

The payload of the response is a JSON blob that gets converted by the python code into a dictionary (if you paste the example url into a browser address bar you should be able to see the response).

Python Translation

The previous section was my attempt to explain to myself more or less how the code works, but I thought it might be easier to understand if we steal some of the code from duckduckgo-search and modify it to get a single output more or less the way ddg_images does it.

First we set up our requests session.

import requests

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; rv:102.0) Gecko/20100101 Firefox/102.0",
    "Referer": "https://duckduckgo.com/",
}

SESSION = requests.Session()
SESSION.headers.update(HEADERS)

Now we get the vqd by doing a POST and then searching the returned HTML for the string.

payload = dict(q="rabbit lop")

response = SESSION.post("https://duckduckgo.com", data=payload, timeout=10)

PREFIX = b"vqd='"

vqd_index_start = response.content.index(PREFIX) + len(PREFIX)
# bytes.index only takes the start position as a positional argument
vqd_index_end = response.content.index(b"'", vqd_index_start)

vqd_bytes = response.content[vqd_index_start:vqd_index_end]

# convert it from bytes back to a string before using it as a payload value
vqd = vqd_bytes.decode()
print(vqd)

3-175223187338608511244788076450682226312-294342034741290420994864096532891255767

Now that we have the VQD (whatever it is) we can make our actual search request by building up the payload dictionary and sending the request to duckduckgo.

payload["o"] = "json"
payload["l"] = "wt-wt"
payload["s"] = 0
payload["f"] = ",,,type:photo,,license:public"
payload["p"] = -1
payload["vqd"] = vqd

response = SESSION.get("https://duckduckgo.com/i.js", params=payload)

The response contains multiple search results but let's unpack the first one as a demonstration.

page_data = response.json()["results"]

row = page_data[0]

output = {
    "title": row["title"],
    "image": row["image"],
    "thumbnail": row["thumbnail"],
    "url": row["url"],
    "height": row["height"],
    "width": row["width"],
    "source": row["source"],
}

SESSION.close()
pprint(output)

{'height': 848,
 'image': 'https://cdn.pixabay.com/photo/2015/12/22/20/27/animals-1104748_1280.jpg',
 'source': 'Bing',
 'thumbnail': 'https://tse1.mm.bing.net/th?id=OIP.lEqFD_LPmRGbc1GyIUoNygHaE6&pid=Api',
 'title': 'Flemish Lop Rabbit Very · Free photo on Pixabay',
 'url': 'https://pixabay.com/en/flemish-lop-rabbit-rabbit-1104748/',
 'width': 1280}

The ddg_images function is doing more than this, but for future reference, here's the basics of what's going on and how the author made the search work.

Sources

FastAI: Picking the Best Model

Cloistered Monkey

2022-11-12 20:41

In the Beginning

In this notebook we'll go over the fastai course lesson 3 - "Which image models are best?". We'll use the benchmarking data from timm, a collection of pyTorch IMage Models, to compare how different computer vision models performed, using time-per-image and accuracy as our metrics.

Imports and Setup

# from python
from functools import partial
from pathlib import Path

# from pypi
from tabulate import tabulate

import altair
import pandas

# monkey
from graeae.visualization.altair_helpers import output_path, save_chart

TABLE = partial(tabulate, tablefmt="orgtbl", headers=["Column", "Value"] )

PLOT_WIDTH, PLOT_HEIGHT = 900, 600
SLUG = "fastai-picking-the-best-model"
OUTPUT_PATH = output_path(SLUG)
save_it = partial(save_chart, output_path=OUTPUT_PATH)

The Validation Data

We'll be using data that's part of the git repository for timm. Once you clone the repository the first file within it that we want will be results/results-imagenet.csv. This is the result of using the Imagenet Validation set to validate the models.

RESULTS = Path("~/projects/third-party/"
               "pytorch-image-models/results").expanduser()
DATA = RESULTS/"results-imagenet.csv"
validation = pandas.read_csv(DATA)

print(TABLE(validation.iloc[0].to_frame()))

Column	Value
model	beit_large_patch16_512
top1	88.602
top1_err	11.398
top5	98.656
top5_err	1.344
param_count	305.67
img_size	512
crop_pct	1.0
interpolation	bicubic

This table shows the first row of the results-imagenet CSV. Each row represents a computer vision model and some information about how it performed during validation. The documentation says that top1 and top5 are "top-1/top-5 differences from clean validation." Which means… what? Looking at the validate.py file it appears that top1 and top5 are measures of accuracy. Looking in the utils.metrics.py module the function accuracy has a docstring that says: Computes the accuracy over the k top predictions for the specified values of k. The top1 and top5 are AverageMeter objects that keep a running average of their accuracies.

This seems straightforward enough, but if you look at that first row the top1 is smaller than the top5 and has a larger error…
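One way to read this: a top-5 prediction counts as correct if the true label is anywhere among the five highest-scoring classes, while top-1 only counts the single highest-scoring class, so top-5 accuracy can never be lower than top-1. Here's a small sketch of the idea in plain python. It isn't the timm accuracy function, and the scores and labels are made up for illustration.

```python
def top_k_accuracy(scores: list, labels: list, k: int) -> float:
    """Percentage of samples whose label is among the k highest scores

    Args:
     scores: one list of per-class scores for each sample
     labels: the index of the correct class for each sample
     k: how many of the top-scoring classes count as a hit
    """
    hits = 0
    for row, label in zip(scores, labels):
        # class indices ordered from highest score to lowest
        ranked = sorted(range(len(row)),
                        key=lambda index: row[index], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100 * hits / len(labels)

# made-up scores for four samples and six classes
SCORES = [
    [0.50, 0.20, 0.12, 0.10, 0.05, 0.03],
    [0.10, 0.40, 0.30, 0.12, 0.05, 0.03],
    [0.06, 0.10, 0.12, 0.20, 0.50, 0.02],
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],
]
LABELS = [0, 2, 5, 0]

top1 = top_k_accuracy(SCORES, LABELS, k=1)
top5 = top_k_accuracy(SCORES, LABELS, k=5)
print(f"top1: {top1} top5: {top5}")
```

The second sample is a miss for top-1 (the highest score belongs to class 1) but a hit for top-5, which is why the top-5 number comes out higher.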

Guessing by the name, the model in our row is an instance of "BEIT: BERT Pre-Training of Image Transformers (https://arxiv.org/abs/2106.08254)" found in timm's beit.py module.

print(validation.shape)

(668, 9)

The model column is the string you use when creating a model and also refers to a function in one of the pytorch-image-models/timm/models modules. If you want to see how the model in our example row is defined, look in the timm/models/beit.py module for a function named "beit_large_patch16_512". You should find something like this.

@register_model
def beit_large_patch16_512(pretrained=False, **kwargs):
    model_kwargs = dict(
        img_size=512, patch_size=16, embed_dim=1024, depth=24, num_heads=16, mlp_ratio=4, qkv_bias=True,
        use_abs_pos_emb=False, use_rel_pos_bias=True, init_values=1e-5, **kwargs)
    model = _create_beit('beit_large_patch16_512', pretrained=pretrained, **model_kwargs)
    return model

So we can now see that besides being a BEIT model the name tells us that it used an image size of 512 and a patch size of 16. Further up the file is this configuration:

'beit_large_patch16_512': _cfg(
    url='https://conversationhub.blob.core.windows.net/beit-share-public/beit/beit_large_patch16_512_pt22k_ft22kto1k.pth',
        input_size=(3, 512, 512), crop_pct=1.0,

Which tells you where the pretrained weights came from.

The Benchmark Data

We're going to merge our "validation" data with two "benchmark" files (also in the "results" folder) doing some cryptic filtering and data wrangling. It's not obvious what everything is doing so let's use it first and maybe figure out most of it later. The main things to note are that we're adding a family column made by taking the first token from the model name (e.g. the model beit_large_patch16_512 gets the family beit), we're adding a secs column by inverting the samples-per-second column, and we're filtering the models down to a subset that is useful to look at.

BENCHMARK_FILE = ("benchmark-{infer_or_train}"
                  "-amp-nhwc-pt111-cu113-rtx3090.csv")
SAMPLE_RATE = "{infer_or_train}_samples_per_sec"
FAMILY_REGEX = r'^([a-z]+?(?:v2)?)(?:\d|_|$)'
FAMILY_FILTER = r'^re[sg]netd?|beit|convnext|levit|efficient|vit|vgg'

def get_data(infer_or_train: str,
             validation: pandas.DataFrame=validation) -> pandas.DataFrame:
    """Load a benchmark dataframe

    Args:
     infer_or_train: part of filename with label (infer or train)
     validation: DataFrame created from validation results file

    Returns:
     benchmark data merged with validation
    """

    frame = pandas.read_csv(
        RESULTS/BENCHMARK_FILE.format(
            infer_or_train=infer_or_train)).merge(
        validation, on='model')
    frame['secs'] = 1. / frame[SAMPLE_RATE.format(infer_or_train=infer_or_train)]
    frame['family'] = frame.model.str.extract(FAMILY_REGEX)
    frame = frame[~frame.model.str.endswith('gn')]
    IN_FILTERER = frame.model.str.contains('in22'), "family"
    frame.loc[IN_FILTERER] = frame.loc[IN_FILTERER] + '_in22'

    RESNET_FILTERER = frame.model.str.contains('resnet.*d'),'family'
    frame.loc[RESNET_FILTERER] = frame.loc[RESNET_FILTERER] + 'd'
    return frame[frame.family.str.contains(FAMILY_FILTER)]
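To make the family extraction a little less cryptic, here's a sketch that applies the same regular expression and the two suffix rules to single model names using python's re module instead of pandas. The family_of helper is something I made up for illustration, and the model names other than the two that appear in this post are just example strings.

```python
import re

FAMILY_REGEX = r'^([a-z]+?(?:v2)?)(?:\d|_|$)'

def family_of(model: str) -> str:
    """Mimic the family column for a single model name

    Args:
     model: the name from the model column
    """
    # lowercase letters (plus an optional "v2") up to the first
    # digit, underscore, or end of the name
    match = re.match(FAMILY_REGEX, model)
    family = match.group(1) if match else ""

    # same suffix rules as get_data, in the same order
    if "in22" in model:
        family += "_in22"
    if re.search("resnet.*d", model):
        family += "d"
    return family

for name in ("beit_large_patch16_512", "levit_128s",
             "resnet50d", "efficientnetv2_rw_s", "vgg16"):
    print(f"{name}: {family_of(name)}")
```

The sketch leaves out the dropping of models ending in "gn" and the final family filter, since those work on whole rows rather than on a single name.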

Build The Base Chart

The build_chart function is going to help us build the basic chart to compare the merged validation and benchmark values for the models.

SELECTION = altair.selection_multi(fields=["family"], bind="legend")
COLUMNS = ["secs", "top1", "family", "model"]

def build_chart(frame: pandas.DataFrame, infer_or_train: str,
                add_selection: bool=True) -> altair.Chart:
    """Build the basic chart for our benchmarks

    Note:
     the ``add_selection`` function can only be called once on a chart so to
     add more layers don't add it here, add it later to the end

    Args:
     frame: benchmark frame to plot
     infer_or_train: which image size column (infer | train)
     add_selection: whether to add the selection at the end
    """
    # altair includes all the data even if it's not used in the plot
    # reducing the dataframe to just the data you need
    # makes the file smaller
    SIZE = f"{infer_or_train}_img_size"
    frame = frame[COLUMNS + [SIZE]]
    chart = altair.Chart(frame).mark_circle().encode(
        x=altair.X("secs", scale=altair.Scale(type="log"),
                   axis=altair.Axis(title="Seconds Per Image (log)")),
        y=altair.Y("top1",
                   scale=altair.Scale(zero=False),
                   axis=altair.Axis(title="Imagenet Accuracy")),
        size=altair.Size(SIZE,
                         scale=altair.Scale(
                             type="pow", exponent=2)),
        color="family",
        tooltip=[altair.Tooltip("family", title="Architecture Family"),
                 altair.Tooltip("model", title="Model"),
                 altair.Tooltip(SIZE, format=",", title="Image Size"),
                 altair.Tooltip("top1", title="Accuracy"),
                 altair.Tooltip("secs", title="Time (sec)", format=".2e")
                 ]
        )
    if add_selection:
        chart = chart.encode(opacity=altair.condition(
                SELECTION,
                altair.value(1),
                altair.value(0.1))
        ).add_selection(SELECTION)
    return chart

Plot All the Architectures

Our first chart for the benchmarking data will plot all the models left in the data-frame after our filtering and merging to show us how they compare for accuracy and average time to process a sample.

def plot_it(frame: pandas.DataFrame,
            title: str,
            filename: str,
            infer_or_train: str,
            width: int=PLOT_WIDTH,
            height: int=PLOT_HEIGHT) -> None:
    """Make an altair plot of the frame

    Args:
     frame: benchmark frame to plot
     title: title to give the plot
     filename: name of file to save the chart to
     infer_or_train: which image size column (infer or train)
     width: width of plot in pixels
     height: height of plot in pixels
    """
    chart = build_chart(frame, infer_or_train).properties(
        title=title,
        width=width,
        height=height,
    )

    save_it(chart, filename)
    return

Plot Some of the Architectures

To make it easier to understand, the author of the fastai lesson chose a subset of the families to plot.

beit
convnext
efficientnetv2
levit
regnetx
resnetd
vgg

Note: The fastai notebook points out that because of the different sample sizes used to train the models it isn't a simple case of picking the "best" performing model (given a speed vs accuracy trade off). The pytorch-image-models repository has information to help research what went into the training.

FAMILIES = 'levit|resnetd?|regnetx|vgg|convnext.*|efficientnetv2|beit'

def subset_regression(frame: pandas.DataFrame,
                      title: str,
                      filename: str,
                      infer_or_train: str,
                      width: int=PLOT_WIDTH,
                      height: int=PLOT_HEIGHT) -> None:
    """Plot subset of model-families

    Args:
     frame: frame with benchmark data
     title: title to give the plot
     filename: name to save the file
     infer_or_train: which image size column
     width: width of plot in pixels
     height: height of plot in pixels
    """
    subset = frame[frame.family.str.fullmatch(FAMILIES)]

    base = build_chart(subset, infer_or_train, add_selection=False)

    line = base.transform_regression(
        "secs", "top1",
        groupby=["family"],
        method="log",
        ).mark_line().encode(
            opacity=altair.condition(
                SELECTION,
                altair.value(1),
                altair.value(0.1)
            ))

    chart = base.encode(
        opacity=altair.condition(
        SELECTION,
        altair.value(1),
        altair.value(0.1)
    ))

    chart = altair.layer(chart, line).properties(
        title=title,
        width=width,
        height=height,
    ).add_selection(SELECTION)

    save_it(chart, filename)
    return

Inference

The first benchmarking data we're going to add is the inference data. Unfortunately I haven't been able to find out what this means, exactly - was this a test of categorizing a test set? It only adds the average sample time to what we're going to plot, which perhaps isn't as interesting as the accuracy anyway.

inference = get_data('infer')
print(TABLE(inference.iloc[0].to_frame()))

Column	Value
model	levit_128s
infer_samples_per_sec	21485.8
infer_step_time	47.648
infer_batch_size	1024
infer_img_size	224
param_count_x	7.78
top1	76.514
top1_err	23.486
top5	92.87
top5_err	7.13
param_count_y	7.78
img_size	224
crop_pct	0.9
interpolation	bicubic
secs	4.654236751715086e-05
family	levit

Let's look at a row of what was added to our original validation data.

added = inference[list(set(inference.columns) - set(validation.columns))].iloc[0]
print(TABLE(added.to_frame()))

Column	Value
secs	4.654236751715086e-05
family	levit
param_count_y	7.78
infer_batch_size	1024
param_count_x	7.78
infer_samples_per_sec	21485.8
infer_step_time	47.648
infer_img_size	224

If you look back at get_data you'll see that we added the secs column which is defined as $\frac{1}{\textit{samples per second}}$. So it's the averaged(?) seconds per sample. I think.
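As a sanity check we can redo the arithmetic for the levit_128s row by hand. The second half of the sketch assumes that infer_step_time is the time for one batch in milliseconds, which is my guess and isn't something I found documented.

```python
# values copied from the levit_128s row above
samples_per_second = 21485.8
step_time = 47.648
batch_size = 1024

# the secs column is the reciprocal of the sample rate
secs = 1.0 / samples_per_second
print(f"{secs:.3e}")

# assumption: step_time is milliseconds per batch, so dividing by
# the batch size should give roughly the same time per image
per_image = step_time / 1000 / batch_size
print(f"{per_image:.3e}")
```

The two values agree to within a fraction of a percent, which fits the reading that secs is the average time to process one image.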

Let's see how evenly distributed the families are.

counts = inference.family.value_counts().to_frame().reset_index().rename(
    columns = {"index": "Family", "family": "Count"})

chart = altair.Chart(counts).mark_bar().encode(
    x="Count", y=altair.Y("Family", sort="-x"), tooltip=["Count"],
).properties(
    width=PLOT_WIDTH,
    height=PLOT_HEIGHT,
    title="Inference Family Counts"
)

save_it(chart, "inference-family-counts")

There doesn't seem to be an even representation of model families. Let's look at the accuracy vs the speed for the models.

plot_it(inference, title="Inference", 
        filename="inference-benchmark",
        infer_or_train="infer")

While we still don't have an explanation of exactly what we're looking at, in the broadest sense it's a plot of the time it takes for a model to process an image (in seconds on a logarithmic scale) versus the accuracy when categorizing the Imagenet dataset.

The color matches the family in the legend.
The size reflects the image size used for the benchmark (the image-size column is mapped to the circle size using a squared scale).
Clicking on a family in the legend will highlight it and suppress the other families.
Hovering over a circle gives the exact information for that point.

I believe that the accuracy is the best performance for a model, so even though a family might have multiple points in the plot, each model will only have one point to represent its best accuracy and the time it took.

A Subset

To make it easier to see what's going on the author(s) of the fastai lesson pared down the dataset to a subset of families and then added regression lines to compare them.

subset_regression(inference,
                  title="Inference Subset",
                  filename="inference-subset-benchmark",
                  infer_or_train="infer")

Training

training = get_data("train")
plot_it(training, title="Training", 
        filename="training-benchmark",
        infer_or_train="train")

subset_regression(training,
                  title="Training Subset",
                  filename="training-subset-benchmark",
                  infer_or_train="train")

Parameters Vs Time

The fastai notebook plots the model parameters vs time (speed), saying that parameters are sometimes used as a proxy for speed and memory use (to make it machine independent, presumably), but then says that it isn't always a good proxy. Once more they give us a tool and then tell us it isn't necessarily what to use.

plotter = inference[["param_count_x", "secs", "infer_img_size", "family", "model", "top1"]]
chart = altair.Chart(plotter).mark_circle().encode(
    x=altair.X("param_count_x", scale=altair.Scale(type="log"),
               axis=altair.Axis(title="Parameters (log)")),
    y=altair.Y("secs", scale=altair.Scale(type="log", zero=False),
               axis=altair.Axis(title="Seconds Per Image (log)")),
    color="infer_img_size",
    tooltip=[altair.Tooltip("family", title="Architecture Family"),
             altair.Tooltip("model", title="Model"),
             altair.Tooltip("infer_img_size", format=",", title="Image Size"),
             altair.Tooltip("top1", title="Accuracy"),
             altair.Tooltip("secs", title="Time (sec)", format=".2e")
             ],
    opacity=altair.condition(
        SELECTION,
        altair.value(1),
        altair.value(0.1))
).add_selection(SELECTION).properties(
    title="Parameters Vs Time",
    width=PLOT_WIDTH,
    height=PLOT_HEIGHT-100)

save_it(chart, "inference-parameters-vs-time")

In this case it looks like parameters and speed are correlated, as it takes more time the more parameters there are, but it's confounded by the fact that the models with more parameters seem to be handling bigger images.

Accuracy Vs Size

The fastai

plotter = inference[["param_count_x", "img_size",
                     "family", "model", "secs", "top1"]]
chart = altair.Chart(plotter).mark_circle().encode(
    x=altair.X("img_size", scale=altair.Scale(zero=False),
               axis=altair.Axis(title="Image Size")),
    y=altair.Y("top1",
               scale=altair.Scale(zero=False),
               axis=altair.Axis(title="Accuracy")),
    size=altair.Size("secs", scale=altair.Scale(type="log")),
    color="family",
    tooltip=[altair.Tooltip("family", title="Architecture Family"),
             altair.Tooltip("model", title="Model"),
             altair.Tooltip("img_size", format=",", title="Image Size"),
             altair.Tooltip("top1", title="Accuracy"),
             altair.Tooltip("secs", title="Time (sec)", format=".2e")
             ],
    opacity=altair.condition(
        SELECTION,
        altair.value(1),
        altair.value(0.1))
).add_selection(SELECTION).properties(
    title="Accuracy Vs Image Size",
    width=PLOT_WIDTH,
    height=PLOT_HEIGHT-100)

save_it(chart, "inference-accuracy-vs-size")

Sources

PyTorch Image Models: Documentation for the timm pre-built computer vision models for pytorch.
Pytorch Image Models on github: Repository for timm.
timm on paperswithcode.com: Table of timm models showing what dataset was used for training and links to publications about each model, and links to a detail page for each model.
README for the timm results folder on GitHub.

FastAI: Saving a Model

Cloistered Monkey

2022-11-12 17:37

Redoing The Cats and Dogs

# python
from pathlib import Path

#fastai
from fastai.vision.all import (
    error_rate,
    get_image_files,
    ImageDataLoaders,
    load_learner,
    Resize,
    resnet34,
    untar_data,
    URLs,
    vision_learner,
)

path = untar_data(URLs.PETS)/'images'

def its_a_cat(filename: str) -> str:
    """Labels the image based on the case of the filename

    Args:
     filename: name of the image file

    Returns:
     "cat" if the first letter of the filename is upper-cased, "dog" otherwise
    """
    return "cat" if filename[0].isupper() else "dog"

loader = ImageDataLoaders.from_name_func(
    path,
    get_image_files(path),
    valid_pct=0.2,
    seed=42,
    label_func=its_a_cat,
    item_tfms=Resize(224)
)

learner = vision_learner(
    loader, resnet34, metrics=error_rate)

model = learner.to_fp16()
with model.no_bar():
    model.fine_tune(1)

[0, 0.12685242295265198, 0.019196458160877228, 0.00405954010784626, '00:19']
[0, 0.06199510768055916, 0.016171308234333992, 0.0067658997140824795, '00:25']

Saving the Model

You can either save the underlying pytorch model or the fastai Learner. We want the simpler way so we'll save the fastai Learner.

MODEL_PATH = '/tmp/model.pkl'
model.export(MODEL_PATH)

Loading the Model

Weirdly, the original fastai jupyter notebook doesn't tell you how to load the model once you've saved it, but I'm assuming that this is the way to do it.

relearner = load_learner(fname=MODEL_PATH)

def check_model(image_path: str) -> None:
    """Predict the category for an image and print the outcome

    Args:
     image_path: path to the image to classify
    """
    image_path = Path(image_path).expanduser()

    with relearner.no_bar():
        category, location, probabilities = relearner.predict(image_path)
    print(f"I think this is a {category}.")
    print(f"The probability that it's a {category} is"
          f" {probabilities[location.item()].item():.2f}")
    return

check_model("~/test-cat.jpg")

I think this is a cat.
The probability that it's a cat is 1.00

check_model("~/test-dog.jpg")

I think this is a dog.
The probability that it's a dog is 1.00

Sources

Raccoon Or Raccoon Dog?

Cloistered Monkey

2022-11-07 16:12

What Is This?

This is a run-through of some of the ideas from Lesson 0 of the FastAI Practical Deep Learning for Coders course (sort of, there's a 2022 version that I'm using which doesn't seem to exactly match the lectures on the website). In it we search for photos using a search engine and build a neural-network to classify the images that belong to one of the two classes of photos that we use. This is an image classification example, like the Cats vs Dogs post but it has the added feature of demonstrating how to build your own dataset using a search-engine. I'll be using Tanuki (the Japanese Raccoon Dog) and Raccoon images as the categories to classify.

Imports

For the search engine we'll use DuckDuckGo via the duckduckgo-search package (from pypi) and its ddg_images function.

# python
from functools import partial
from pathlib import Path
from time import sleep

import os, warnings

# pypi
from duckduckgo_search import ddg_images
from dotenv import load_dotenv

import torch

# fastai
from fastai.data.all import (
    CategoryBlock,
    DataBlock,
    parent_label,
    RandomSplitter,
)
from fastai.vision.all import (
    download_images,
    get_image_files,
    ImageBlock,
    Resize,
    resnet18,
    resize_images,
    verify_images,
    vision_learner,
    error_rate,
    PILImage,
)

from fastcore.net import urlsave

# monkey shines
from graeae import Timer

TIMER = Timer()
load_dotenv()

DATA_PATH = Path(os.environ["FASTAI_DATA"])/"raccoon-vs-tanuki"
assert torch.cuda.is_available()

Note: The DATA_PATH is where we're going to store the images we download. We are going to use a function (parent_label) that uses the folders within this directory to label the images within the folders (so images in a folder named "herbert" will be labeled "herbert"). This means that it should only have the folders that we are going to use to build the model. I originally set it to the fastai root data path, which made the data loader think that all the other data folders were labels as well, so I created a sub-folder named "raccoon-vs-tanuki" to isolate the images I need to train the model.
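
As a rough illustration of the folder-as-label idea, here is a minimal sketch using only pathlib. The function name label_from_parent and the file paths are made up for the example; it mimics what I understand parent_label to do rather than reproducing fastai's code.

```python
from pathlib import Path


def label_from_parent(path: Path) -> str:
    """Use the name of the folder holding the file as its label (hypothetical helper)."""
    return Path(path).parent.name


# made-up paths laid out the way DATA_PATH is organized
examples = [
    Path("raccoon-vs-tanuki/tanuki/0001.jpg"),
    Path("raccoon-vs-tanuki/raccoon/0001.jpg"),
    Path("raccoon-vs-tanuki/herbert/0001.jpg"),
]
labels = [label_from_parent(example) for example in examples]
print(labels)
```

Any stray folder under the data path (like "herbert" here) becomes a label, which is why the images need a sub-folder of their own.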

Getting the Images

We're going to create an alias for the ddg_images function to make the search and then return only the URLs of the images (or their thumbnails) that DuckDuckGo finds.

def get_image_urls(keywords: str,
                   max_images: int=200,
                   license_image: str="any",
                   key="image") -> list:
    """Search duckduckgo images

    Args:
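     keywords: A string with keywords to give to duckduckgo
     max_images: the upper limit for how many images to return
     license_image: license filter to pass on to ddg_images (e.g. "any" or "Public")
     key: which field of each search result to return ("image", "thumbnail", or "url")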

    Returns:
     a list-like object with the URLs of the images found
    """
    return [output.get(key) for output in 
            ddg_images(keywords,
                       type_image="photo",
                       license_image=license_image,
                       max_results=max_images)]

A Test Of Tanuki

We'll start by checking that our searcher is working using the keywords "tanuki" and "raccoon". First, what does ddg_images return when we search for tanuki?

o = ddg_images("tanuki", type_image="photo", max_results=1)
print(o)

[{'title': 'Tanuki | Animal Jam Fanon Wiki | Fandom', 'image': 'https://vignette.wikia.nocookie.net/ajfanideas/images/6/6a/God_damnit.png/revision/latest?cb=20190222141158', 'thumbnail': 'https://tse3.mm.bing.net/th?id=OIP.74LPltCuN75QxFq2RHLhywHaFj&pid=Api', 'url': 'https://ajfanideas.fandom.com/wiki/Tanuki', 'height': 1200, 'width': 1600, 'source': 'Bing'}]

So it looks like it returns a list of json/dict objects. I'll print it out in a table to maybe make it easier to see. First, the title contains 'pipes' that break the table so I'll replace them with commas.

o[0]["title"] = o[0]["title"].replace("|", ",")

Now the table.

print("|Key | Value|")
print("|-+-|")
for key, value in o[0].items():
    print(f"|{key}| {value}|")

Key	Value
title	Tanuki , Animal Jam Fanon Wiki , Fandom
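image	https://vignette.wikia.nocookie.net/ajfanideas/images/6/6a/God_damnit.png/revision/latest?cb=20190222141158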
thumbnail	https://tse3.mm.bing.net/th?id=OIP.74LPltCuN75QxFq2RHLhywHaFj&pid=Api
url	https://ajfanideas.fandom.com/wiki/Tanuki
height	1200
width	1600
source	Bing

Looking at the image URL you might not guess that it was an image of a tanuki (is the tanuki named "God damnit"?), but the title suggests that it is. Interestingly, if you follow the URL to the page where the image comes from you'll see that it's a wiki dedicated to a game called "Animal Jam" but the author of the page says that they couldn't find an image of the tanuki from the game so it is, indeed, a photo of a real tanuki, not a game character.

That's the output of ddg_images but we created get_image_urls to make it a little simpler to get just the URLs so let's search for "tanuki" images again but this time I'm going to download and show the image so I'll specify that I want to pull the image from the Public Domain.

TANUKI_URLS = get_image_urls("tanuki", max_images=1, license_image="Public")
print(TANUKI_URLS[0])

https://c.pxhere.com/photos/04/ff/animal_marten_raccoon_dog_tanuki_enok_obstfuchs_omnivore_fur-705793.jpg!d

The images are usually pretty big so let's download a thumbnail of the image and take a look at it to make sure we're getting the image we expect. The original fastai notebook uses the fastai download_url function (which appears to come from another fastai project called fastdownload (github, documentation)) but it looks like all this function is doing is starting a progress bar (which I can't use here) and then calling urlsave from another fastai library called fastcore (documentation), so I'll use urlsave instead.

THUMBS = get_image_urls("tanuki", max_images=1, license_image="Public", key="thumbnail")
TANUKI_OUTPUT = "/tmp/tanuki_thumb.jpg"
urlsave(url=THUMBS[0], dest=TANUKI_OUTPUT)

And here's the thumbnail we downloaded.

Seems to work. One disadvantage to using the get_image_urls to alias the ddg_images function is that we end up throwing away the other information, so to get the source URL to see the page where the image comes from we have to make another function call.
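
One way around throwing the information away would be to keep a few fields from each result instead of just one. This is only a sketch: pick_fields is a hypothetical helper (it isn't part of duckduckgo-search) and the sample is a trimmed copy of the search result printed earlier, so it runs without hitting the search engine.

```python
def pick_fields(results: list, keys: tuple = ("image", "thumbnail", "url")) -> list:
    """Keep only the chosen keys from each search result (hypothetical helper)."""
    return [{key: result.get(key) for key in keys} for result in results]


# a trimmed copy of the search result printed earlier
sample = [{
    "title": "Tanuki | Animal Jam Fanon Wiki | Fandom",
    "thumbnail": "https://tse3.mm.bing.net/th?id=OIP.74LPltCuN75QxFq2RHLhywHaFj&pid=Api",
    "url": "https://ajfanideas.fandom.com/wiki/Tanuki",
    "height": 1200,
    "width": 1600,
}]

picked = pick_fields(sample, keys=("thumbnail", "url"))
print(picked[0]["url"])
```

With something like this a single search would give both the thumbnail and the page it came from.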

print(get_image_urls("tanuki", max_images=1, license_image="Public", key="url")[0])

https://pxhere.com/en/photo/705793

This image comes from pxhere.com which appears to be a public domain image hosting site.

Now for the raccoon.

RACCOON_OUTPUT = "/tmp/raccoon_thumb.jpg"
RACCOON_URLS = get_image_urls("raccoon",
                             max_images=1,
                             license_image="Public", key="thumbnail")

urlsave(url=RACCOON_URLS[0],
        dest=RACCOON_OUTPUT)

print(get_image_urls("raccoon",
                     max_images=1,
                     license_image="Public", key="url")[0])

http://www.publicdomainpictures.net/view-image.php?image=33712&picture=raccoon-4&large=1

Build A Data Set

Now that we've done a little check of what our function does we can move on to creating our dataset using it. When you download an archived dataset from fastai it saves it to the ~/.fastai/ directory, so I'll put this dataset there too. I'll use fastai's download_images function to do the actual downloading.

print(download_images.__doc__)

Download images listed in text file `url_file` to path `dest`, at most `max_pics`

These are the arguments it takes.

Argument	Meaning	Default
dest	Folder Path to save files to	None (required)
url_file	Text file with one URL per line to use as source	None (only used if `urls` is None)
urls	Iterable collection of URLs to download	None
max_pics	Limit on the number of images to download	1000
n_workers	Number of parallel threads to use	8
timeout	Seconds to allow for a download	4
preserve_filename	Whether to use the filename in the URL	False

We'll add two extra keywords - "sun" and "shade" to the search to hopefully get images that match those conditions and between each search query I'll put in a sleep so that we aren't hitting the server too hard. We'll also use fastai's resize_images to make sure that none of the images are too big. The argument max_size gives the maximum number of pixels either dimension (height or width) can have.
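
To make the max_size argument a little more concrete, here's a small sketch of the arithmetic I'm assuming is behind it: shrink the image so that its longer side equals max_size while keeping the aspect ratio, and leave images that are already small enough alone. The name fit_within is made up and this isn't fastai's code.

```python
def fit_within(width: int, height: int, max_size: int) -> tuple:
    """Scale (width, height) so the longer side is at most max_size (hypothetical helper)."""
    longest = max(width, height)
    if longest <= max_size:
        # already small enough, leave it alone
        return width, height
    ratio = max_size / longest
    return int(width * ratio), int(height * ratio)


print(fit_within(1600, 1200, 400))
print(fit_within(300, 200, 400))
```

The 1600 x 1200 case matches the size of the tanuki image that the search returned earlier.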

def download_and_resize(destination: Path, search_terms: str, max_size: int=400) -> None:
    """Download images and resize them

    Args:
     destination: path to parent folder
     search_terms: keywords to use to search for images
     max_size: maximum size for height and width of images
    """
    download_images(
        dest=destination,
        urls=get_image_urls(search_terms)
    )

    resize_images(path=destination,
                  max_size=max_size,
                  dest=destination)
    return

The path argument is the source of the images and the dest is where you want to put the resized images. Normally I don't suppose you'd want to remove the original images, but in this case I do so they're set to the same folder.

ANIMALS = ("tanuki", "raccoon")

PAUSE = 10
PAUSE_BETWEEN_SEARCHES = partial(sleep, PAUSE)
CONDITIONS = tuple(("", "sun ", "shade "))

print(f"Estimated Run Time: {len(CONDITIONS) * len(ANIMALS) * PAUSE + 15} seconds")

with TIMER:
    print("Searching for:")
    for animal in ANIMALS:
        destination = DATA_PATH/animal
        destination.mkdir(exist_ok=True, parents=True)

        for condition in CONDITIONS:
            SEARCH_TERMS = f"{animal} {condition}"
            print(f" - '{SEARCH_TERMS}'")
            download_and_resize(destination, SEARCH_TERMS)
            PAUSE_BETWEEN_SEARCHES()

Estimated Run Time: 75 seconds
Started: 2022-12-08 18:24:00.371278
Searching for:
 - 'tanuki '
 - 'tanuki sun '
 - 'tanuki shade '
 - 'raccoon '
 - 'raccoon sun '
 - 'raccoon shade '
Ended: 2022-12-08 18:27:07.360951
Elapsed: 0:03:06.989673

Verify the Dataset

Some of the images might be invalid for whatever reason, so we'll use a fastai built-in function (verify_images) to check them and Path.unlink to delete the files that were deemed invalid. verify_images works by trying to open each file as an image. It adds some parallelism to speed it up but isn't doing anything fancy and, depending on how many files you have and their size, might take a little while to complete.
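
Here's a sketch of the check-in-parallel-then-delete pattern using only the standard library. It is a stand-in, not what verify_images does: instead of actually opening each file as an image it only looks at the first few bytes for a JPEG or PNG signature, and the names looks_like_image and find_failures are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import tempfile

# first bytes of JPEG and PNG files
SIGNATURES = (b"\xff\xd8\xff", b"\x89PNG\r\n\x1a\n")


def looks_like_image(path: Path) -> bool:
    """Cheap stand-in for opening the image: check the file signature."""
    try:
        with open(path, "rb") as reader:
            return reader.read(8).startswith(SIGNATURES)
    except OSError:
        return False


def find_failures(paths, workers: int = 4) -> list:
    """Check the files in parallel and return the ones that failed."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        verdicts = list(pool.map(looks_like_image, paths))
    return [path for path, okay in zip(paths, verdicts) if not okay]


with tempfile.TemporaryDirectory() as folder:
    good = Path(folder) / "good.jpg"
    bad = Path(folder) / "bad.jpg"
    good.write_bytes(b"\xff\xd8\xff\xe0" + b"\x00" * 16)
    bad.write_bytes(b"<html>not an image</html>")

    failed = find_failures([good, bad])
    failed_names = [path.name for path in failed]
    # delete the failures, like failed.map(Path.unlink)
    for path in failed:
        path.unlink()

print(failed_names)
```

A web page saved with a .jpg extension (which is the sort of thing a failed download can leave behind) gets caught by the check.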

with TIMER:
    failed = verify_images(get_image_files(DATA_PATH))
    failed.map(Path.unlink)
print(f"{len(failed)} images were deemed failures.")

Started: 2022-12-08 18:28:17.277347
Ended: 2022-12-08 18:28:22.806488
Elapsed: 0:00:05.529141
18 images were deemed failures.

Training the Model

loaders = DataBlock(
    blocks=(ImageBlock, CategoryBlock), 
    get_items=get_image_files, 
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(192, method='squish')]
).dataloaders(DATA_PATH)

Parameter	Argument	Description
blocks	(`ImageBlock, CategoryBlock`)	Defines the inputs as images and outputs as categories
get_items	`get_image_files`	A function to search for image files.
splitter	`RandomSplitter`	A class to split the data into training (80%) and validation (20%)
get_y	`parent_label`	A function that grabs the name of the folder to use as an image label.
item_tfms	`Resize`	Resize all the images to a uniform size (192 x 192) by squishing them.
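
The splitter is the part I find easiest to picture, so here's a sketch of a seeded 80/20 random split using the standard library. random_split is a hypothetical stand-in for RandomSplitter (which, as far as I can tell, works with indices and pytorch's random number generator rather than python's), so the exact items that land in each set won't match fastai's.

```python
import random


def random_split(items: list, valid_pct: float = 0.2, seed: int = 42) -> tuple:
    """Shuffle the items and cut off valid_pct of them for validation (hypothetical helper)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(valid_pct * len(shuffled))
    return shuffled[cut:], shuffled[:cut]


files = [f"image_{index}.jpg" for index in range(10)]
training, validation = random_split(files)
print(len(training), len(validation))
```

Fixing the seed means the same files end up in the validation set every time you run it, which makes the runs comparable.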

And now we train the categorizer.

with warnings.catch_warnings() as catcher:
    warnings.simplefilter("ignore")
    learner = vision_learner(loaders, resnet18, metrics=error_rate)

with learner.no_bar() as nobar, Timer() as timmy:
    learner.fine_tune(3)

Started: 2022-12-08 18:28:55.158226
[0, 0.29742079973220825, 0.0608036108314991, 0.020555555820465088, '00:12']
[0, 0.05481018126010895, 0.0367087759077549, 0.009444444440305233, '00:16']
[1, 0.029521549120545387, 0.0178204495459795, 0.004999999888241291, '00:17']
[2, 0.012186083942651749, 0.013458597473800182, 0.006666666828095913, '00:17']
Ended: 2022-12-08 18:29:59.272703
Elapsed: 0:01:04.114477

I put the suppression of the warnings in because somebody (I assume FastAI) is calling pytorch with deprecated arguments.

Some Examples

A Helper

def predict_category(path: str, learner) -> tuple:

    with learner.no_bar():
        prediction, probability_index, probabilities = learner.predict(
            PILImage.create(path))
    print(f"This is a {prediction}.")
    print(f"Probability it's a {prediction}: {float(probabilities[int(probability_index)]):.2f}")
    return prediction, probability_index, probabilities

predict = partial(predict_category, learner=learner)

A Tanuki

Let's look at the output of the learner.predict method when we pass the model the picture of a raccoon dog that we looked at when we were looking at the duckduckgo search example.

TANUKI_PATH = "/tmp/tanuki_image.jpg"
urlsave(url=TANUKI_URLS[0], dest=TANUKI_PATH)
prediction, probability_index, probabilities = predict(TANUKI_PATH)

This is a raccoon.
Probability it's a raccoon: 0.99

Prediction

The prediction returned by learner.predict is a string version of whatever your labeling function (parent_label in this case) returns.

print(prediction)

raccoon

In this case it thinks it's a raccoon, not a raccoon dog, so our model probably isn't ready for prime-time, but let's look at the rest of the output anyway.

Probabilities

The probabilities is a TensorBase which, for our purposes, acts like a list of the probabilities that our image belongs to one of the classifications.

print(probabilities)

TensorBase([0.9894, 0.0106])

There are two probabilities because we have two classifications (raccoon and tanuki). When I first encountered fastai one of the things I couldn't figure out is which probability matches which classification. To figure that out you need our next value, the probability_index.

Probability Index

The probability_index tells you which one of the probabilities matches the predicted classification.

print(probability_index)

TensorBase(0)

Our model predicted that the image was a raccoon and since the probability index is 0, the "raccoon" category matches the first entry in the probabilities collection, and looking back at the probabilities this means that the model is 99% sure that this is a raccoon.
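
Putting the three outputs together with plain python values (copied from the output above) might look like the sketch below. The vocabulary list is my own construction: index 0 is raccoon because that's what the probability index told us, and with only two classes the other entry has to be tanuki. I believe fastai keeps the real label order in learner.dls.vocab, but I haven't checked that here.

```python
# values copied from the output above
vocabulary = ["raccoon", "tanuki"]
probabilities = [0.9894, 0.0106]
probability_index = 0

by_label = dict(zip(vocabulary, probabilities))
predicted = vocabulary[probability_index]

print(by_label)
print(f"{predicted}: {by_label[predicted]:.2f}")
```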

Now, a Raccoon

RACCOON_PATH = "/tmp/raccoon_image.jpg"
urlsave(url=RACCOON_URLS[0], dest=RACCOON_PATH)
prediction, probability_index, probabilities = predict(RACCOON_PATH)

This is a raccoon.
Probability it's a raccoon: 1.00

It's really sure this is a raccoon.

And Then, the End

The final error rate for the model during training was pretty low (less than 1%) but it wasn't able to identify our one tanuki test image. On the one hand, an error rate of less than 1% isn't an error rate of 0, so I might just have chosen one example that is particularly hard. It might also be important that tanuki and raccoons do look quite a bit alike, so this is a harder problem than, say, cats versus dogs. Also, our method for gathering images isn't checking that the images are unique (although the URLs are, they might be redundant postings), and tanuki might be obscure enough that there isn't a huge variety of images out there, making it harder for the model to learn to identify them.

FastAI QuickStart Tabular Data

Cloistered Monkey

2022-11-04 17:27

The Beginning

Imports

# python
from functools import partial

# fastai
from fastai.tabular.all import (
    Categorify,
    FillMissing,
    Normalize,
    TabularDataLoaders,
    URLs,
    accuracy,
    tabular_learner,
    untar_data,
)

# pypi
from tabulate import tabulate

import numpy
import pandas

# my stuff
from graeae import Timer

table = partial(tabulate, tablefmt="orgtbl", headers="keys")
TIMER = Timer()

The Middle

The Data

We're using the Adult Data Set, which has an unfortunate title but is a dataset built from 1994 census data to predict whether a person has an income greater than $50,000 a year.

path = untar_data(URLs.ADULT_SAMPLE)
DATA_PATH = path/"adult.csv"
data = pandas.read_csv(DATA_PATH)

numerical = data.select_dtypes(include=[numpy.number])
non_numerical = data.select_dtypes(exclude=[numpy.number])

print(table(numerical.describe()))

	age	fnlwgt	education-num	capital-gain	capital-loss	hours-per-week
count	32561	32561	32074	32561	32561	32561
mean	38.5816	189778	10.0798	1077.65	87.3038	40.4375
std	13.6404	105550	2.573	7385.29	402.96	12.3474
min	17	12285	1	0	0	1
25%	28	117827	9	0	0	40
50%	37	178356	10	0	0	40
75%	48	237051	12	0	0	45
max	90	1.4847e+06	16	99999	4356	99

print(table(non_numerical.describe()))

	workclass	education	marital-status	occupation	relationship	race	sex	native-country	salary
count	32561	32561	32561	32049	32561	32561	32561	32561	32561
unique	9	16	7	15	6	5	2	42	2
top	Private	HS-grad	Married-civ-spouse	Prof-specialty	Husband	White	Male	United-States	<50k
freq	22696	10501	14976	4073	13193	27816	21790	29170	24720

The column names don't really make clear what some things are, but since this is a quickstart I'll ignore their meaning. It was useful to split the data up by numeric and non-numeric types, though, because when you build the TabularDataLoaders you should specify the numeric and categorical column names. The fastai example only specifies some of the columns but I'll dump them all in and see what happens.

numeric_columns = numerical.columns.to_list()
categorical_columns = non_numerical.columns.to_list()[:-1]

The Data Loader

The original quickstart uses the TabularDataLoaders class to load batches of data for training, along with some pre-processing classes to encode the categorical data to make it numeric, fill in the missing values, and normalize the values so their ranges will match.
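
To give a feel for what the three pre-processing steps do, here's a toy version using only the standard library. The function names echo the fastai classes but these are my own simplified stand-ins, and the details (codes starting at 1 so that 0 is free for unknown values, filling with the median, dividing by the population standard deviation) are assumptions about reasonable defaults, not a description of fastai's internals.

```python
from statistics import mean, median, pstdev


def categorify(values: list) -> tuple:
    """Swap each category for an integer code (0 is left free for unknowns)."""
    codes = {label: code for code, label in enumerate(sorted(set(values)), start=1)}
    return [codes[value] for value in values], codes


def fill_missing(values: list) -> list:
    """Replace missing entries (None) with the median of the rest."""
    middle = median(value for value in values if value is not None)
    return [middle if value is None else value for value in values]


def normalize(values: list) -> list:
    """Shift and scale the values to zero mean and unit standard deviation."""
    center, spread = mean(values), pstdev(values)
    return [(value - center) / spread for value in values]


workclass, codes = categorify(["Private", "State-gov", "Private"])
education = normalize(fill_missing([9, None, 13]))
print(workclass, codes)
print(education)
```

The order matters: the missing values have to be filled in before normalizing, otherwise there's no mean to compute.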

TARGET = "salary"

loader = TabularDataLoaders.from_csv(
    DATA_PATH, path=path, y_names=TARGET,
    cat_names = categorical_columns,
    cont_names = numeric_columns,
    procs = [Categorify, FillMissing, Normalize])

The Learner

learner = tabular_learner(loader, metrics=accuracy)

with learner.no_bar() as nobu, TIMER as tim:
    learner.fit_one_cycle(2)

Started: 2022-11-06 17:09:17.344678
[0, 0.37480291724205017, 0.35229262709617615, 0.8412162065505981, '00:02']
[1, 0.3569386303424835, 0.34605613350868225, 0.8421375751495361, '00:02']
Ended: 2022-11-06 17:09:23.994030
Elapsed: 0:00:06.649352

The Learned

Since the last column (salary) is the target, we'll have to drop it before handing the data to the trained model as a test set.

unsalaried = data.drop(["salary"], axis=1)

test_set = learner.dls.test_dl(unsalaried)

row, classifications, probabilities = learner.predict(
    data.iloc[0])

Sources

FastAI QuickStart: This is where I got the beginnings of the stuff here but it stops before showing you how to use the model you build.
FastAI Tabular Training page: This is where I got most of this stuff.
StackOverflow answer on how to select pandas columns by data-type.

FastAI QuickStart

Cloistered Monkey

2022-11-03 19:40

I'm going to go through the fastai QuickStart to make sure I can get everything working before going too deep into the course. This is a post to point to the other posts that are related to the QuickStart.

FastAI Quickstart: Movie Review Sentiment

Cloistered Monkey

2022-11-02 19:49

The Beginning

The top post for the quickstart posts is this one and the previous post was on image segmentation.

Imports

# python
from pathlib import Path

# fastai
from fastai.text.all import (
    accuracy,
    AWD_LSTM,
    TextDataLoaders,
    text_classifier_learner,
    untar_data,
    URLs,
)

# monkey stuff
from graeae import Timer

TIMER = Timer()

The Model

Training the RNN

Note: Using untar_data for this dataset seems to fail with a FileNotFoundError. It's looking for a file ~/.fastai/data/imdb_tok/counter.pkl that isn't there. It seems to be something that other people have encountered as well (on the fastai forums, on github). This might need to be downloaded the old-fashioned way instead.

Update: untar_data seems to successfully download the imdb data but then something goes wrong with the imdb_tok folder. If you just delete it and pass in the path to the imdb folder (e.g. ~/.fastai/data/imdb/) to the data loader it seems to work.

path = untar_data(URLs.IMDB)

So here's the bit where we have to work around the untar_data error and pass in the path to the downloaded folder to the TextDataLoaders (why is this plural?).

Note: On the machine that I'm using the defaults for the transfer learning causes a CUDA Out of Memory Error. Following the advice in this github ticket I reduced the batch size to get it to work (the default is 64, 32 still crashed so I went for 16) so it takes a fairly long time to finish.
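
I found the batch size by hand (64, then 32, then 16), but the same search could be automated along the lines of this sketch. Everything in it is hypothetical: largest_batch_that_fits is a made-up helper and pretend_training just simulates the crash so the example runs without a GPU. As far as I know pytorch reports CUDA out-of-memory as a RuntimeError (or a subclass of it), which is why that's the exception being caught.

```python
def largest_batch_that_fits(train, start: int = 64, minimum: int = 1) -> int:
    """Halve the batch size until train(batch_size) stops raising a RuntimeError."""
    batch_size = start
    while batch_size >= minimum:
        try:
            train(batch_size)
            return batch_size
        except RuntimeError as error:
            print(f"batch size {batch_size} failed: {error}")
            batch_size //= 2
    raise RuntimeError("even the smallest batch size didn't fit")


def pretend_training(batch_size: int) -> None:
    """Stand-in for the real training: anything over 16 'runs out of memory'."""
    if batch_size > 16:
        raise RuntimeError("CUDA out of memory (simulated)")


chosen = largest_batch_that_fits(pretend_training)
print(chosen)
```

In real use you would probably want to try a single batch rather than a whole training run at each step, since the run here takes most of an hour.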

path = Path("~/.fastai/data/imdb/").expanduser()
BATCH_SIZE = 16

loader = TextDataLoaders.from_folder(path , valid='test', bs=BATCH_SIZE)
learner = text_classifier_learner(loader, AWD_LSTM, drop_mult=0.5, metrics=accuracy)

with learner.no_bar() as no_bar, TIMER as t:
    learner.fine_tune(2, 1e-2)

Started: 2022-11-03 18:19:52.039717
[0, 0.5076004862785339, 0.4184568226337433, 0.8115599751472473, '09:36']
[0, 0.2924654185771942, 0.21511425077915192, 0.9154800176620483, '14:35']
[1, 0.22890949249267578, 0.19540540874004364, 0.9239199757575989, '14:35']
Ended: 2022-11-03 18:58:39.453115
Elapsed: 0:38:47.413398

Testing it Out

with learner.no_bar():
    print(learner.predict("I really like this movie.")[2][1])
    print(learner.predict("I like this movie.")[2][1])
    print(learner.predict("I didn't like this movie.")[2][1])
    print(learner.predict("I hated this movie.")[2][1])
    print(learner.predict("I really hated this movie.")[2][1])

tensor(0.4845)
tensor(0.8964)
tensor(0.5823)
tensor(0.0839)
tensor(0.1197)

Oddly, it appears to think "really hated" is more positive than "hated" (and "really like" less positive than "like"), but otherwise it more or less follows what you'd expect. But is this a general sentiment analyzer or only a movie sentiment analyzer?

with learner.no_bar():
    print(learner.predict("I really like this book.")[2][1])
    print(learner.predict("I like this car.")[2][1])
    print(learner.predict("I didn't like this lettuce.")[2][1])
    print(learner.predict("I hated this weather.")[2][1])
    print(learner.predict("I really hated this meeting.")[2][1])

tensor(0.9321)
tensor(0.7614)
tensor(0.2984)
tensor(0.0582)
tensor(0.3288)

It seems to pretty much follow the same pattern, although it was even more confused by really hating meetings.

with learner.no_bar():
    print(learner.predict("This lettuce is great.")[2][1])
    print(learner.predict("This lettuce is like butter.")[2][1])
    print(learner.predict("This lettuce was good enough.")[2][1])
    print(learner.predict("The lettuce was okay.")[2][1])
    print(learner.predict("I thought this lettuce wasn't very good.")[2][1])
    print(learner.predict("This lettuce is terrible.")[2][1])
    print(learner.predict("This lettuce was worse than spam.")[2][1])

tensor(0.9444)
tensor(0.2520)
tensor(0.6674)
tensor(0.3201)
tensor(0.4201)
tensor(0.0237)
tensor(0.0191)

End

I think as long as you pass in adjectives that it encountered in the reviews it works well enough. In a way it's more interesting to see how terms match up relative to each other (e.g. terrible isn't as bad as hated and okay is worse than wasn't very good) as it gives you a sense of how the reviewers use descriptive words in their reviews.
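
To see the relative ordering at a glance you can sort the phrases by their scores. This sketch uses the numbers copied from the lettuce output above rather than calling the model, and takes index 1 of the probabilities to be the positive class, as the rest of the post does.

```python
# probabilities copied from the lettuce output above
scores = {
    "This lettuce is great.": 0.9444,
    "This lettuce is like butter.": 0.2520,
    "This lettuce was good enough.": 0.6674,
    "The lettuce was okay.": 0.3201,
    "I thought this lettuce wasn't very good.": 0.4201,
    "This lettuce is terrible.": 0.0237,
    "This lettuce was worse than spam.": 0.0191,
}

# most positive first
ranked = sorted(scores, key=scores.get, reverse=True)
for phrase in ranked:
    print(f"{scores[phrase]:.4f}  {phrase}")
```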

FastAI Quickstart: Segmentation

Cloistered Monkey

2022-11-02 18:20

Beginning

This is a look at the part of the fastai Quick Start that demonstrates image segmentation (Wikipedia Article). The goal here is to break images up into sub-parts.

The top post for the quickstart posts is this one and the previous post was the cat image identifier.

Imports

from fastai.vision.all import (
    SegmentationDataLoaders,
    SegmentationInterpretation,
    URLs,
    get_image_files,
    resnet34,
    unet_learner,
    untar_data,
    )

import numpy

Middle

Data, Model, Training

The dataset is a subset of the Cambridge-Driving Labeled Video Database (CamVid).

path = untar_data(URLs.CAMVID_TINY)

loader = SegmentationDataLoaders.from_label_func(
    path, bs=8, fnames=get_image_files(path/"images"),
    label_func = lambda o: path/'labels'/f'{o.stem}_P{o.suffix}',
    codes = numpy.loadtxt(path/'codes.txt', dtype=str)
)
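
The label_func lambda is a bit dense, so here is the same rule written out as a function using only pathlib: the label for an image lives in the "labels" folder and has the same name as the image with "_P" added before the extension. The folder and file names in the example are made up, not real names from the dataset.

```python
from pathlib import Path


def label_path(image: Path, root: Path) -> Path:
    """Build the label file's path from the image's path (same rule as the lambda)."""
    return root / "labels" / f"{image.stem}_P{image.suffix}"


root = Path("camvid_tiny")                 # stand-in for the untar_data path
image = root / "images" / "frame_001.png"  # made-up file name
label = label_path(image, root)
print(label)
```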

learner = unet_learner(loader, resnet34)
with learner.no_bar():
    learner.fine_tune(8)

[0, 3.394050359725952, 2.541146755218506, '00:01']
[0, 1.87003755569458, 1.5329511165618896, '00:01']
[1, 1.567323088645935, 1.4149396419525146, '00:01']
[2, 1.3944206237792969, 1.1255743503570557, '00:01']
[3, 1.2387481927871704, 0.8764406442642212, '00:01']
[4, 1.1080732345581055, 0.8167174458503723, '00:01']
[5, 1.0044615268707275, 0.77195143699646, '00:01']
[6, 0.91986483335495, 0.7509599924087524, '00:01']
[7, 0.8545125722885132, 0.7430445551872253, '00:01']

Column	Label
0	epoch
1	train_loss
2	valid_loss
3	time

with learner.no_bar():
    learner.show_results(max_n=6, figsize=(20, 30))

Note: The example ends with a plot of the images that had the greatest loss, but out of the box it doesn't work in this org-mode setup so I'll skip it for now, since I think it will be a bit of a slog figuring out how to get it working.

End

The top post for the quickstart posts is this one and the next post will be on sentiment analysis.

FastAI Cats and Dogs

Cloistered Monkey

2022-10-25 17:50

What Is This?

This is a run-through of the fastai Computer Vision Quickstart that shows how to build an image classification model from a public dataset hosted on fastai's site. It is similar to the post on classifying rabbits and pigs except in the other post we create our own dataset by searching duckduckgo for images.

Importing

# python standard library
from pathlib import Path

As noted on Stack Overflow, FastAI does a lot of monkey patching, so if you just import something from where it's defined (to make it clearer where things are coming from) it might not have the methods or attributes you expect. In this case, for instance, the vision_learner function is defined in fastai.vision.learner but if you try and import it from there the object you get back won't have the to_fp16 method that we're going to use so you have to import it from fastai.vision.all instead. Since there's no good way to avoid using all I'll import objects from there but I'll try and also point to the original modules where things are defined to make it easier to look things up.
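
If monkey patching is new to you, this toy example shows the general mechanism: one piece of code defines a class and another piece attaches a method to it afterwards, so whether the method exists depends on whether the patching code has run. The Learner class here is a made-up stand-in and has nothing to do with fastai's actual Learner or how fastai does its patching.

```python
class Learner:
    """Stand-in for a class defined in one module."""

    def __init__(self, name: str) -> None:
        self.name = name


print(hasattr(Learner, "to_fp16"))  # checked before the patch is applied


def to_fp16(self) -> "Learner":
    """Stand-in for a method that a different module defines."""
    self.precision = "half"
    return self


# the patch: another module attaches its function to the existing class
Learner.to_fp16 = to_fp16

learner = Learner("cats").to_fp16()
print(learner.precision)
```

Importing from an "all" module is what makes sure every patch has been applied by the time you use the class.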

Module	Import
fastai.data.external	untar_data, URLs
fastai.data.transforms	get_image_files
fastai.metrics	error_rate
fastai.vision.augment	Resize
fastai.vision.core	PILImage
fastai.vision.data	ImageDataLoaders
fastai.vision.learner	vision_learner
torchvision.models.resnet	resnet34

from fastai.vision.all import (
    ImageDataLoaders,
    PILImage,
    Resize, 
    URLs,
    error_rate,
    get_image_files,
    resnet34,
    untar_data,
    vision_learner,
)

Setting Up

This downloads the Oxford-IIIT Pet Dataset. Despite the name, there are only cats and dogs in the dataset (37 breeds across the species).

Function/Object	Description	Documentation Link
`untar_data`	Function to download fastai datasets/weights	External Data, function arguments
`URLs`	Constants for datasets	A brief description

By default this will download the data to ~/.fastai/data. Both untar_data and URLs (note that the s at the end is lowercase) take an argument c_key that allows changing this, but I don't know what the difference is between using one or the other.

path = untar_data(URLs.PETS)/"images"
print(path)

/home/athena/.fastai/data/oxford-iiit-pet/images

The names of the files give the breed of the pet (either cat or dog) with dog names all in lower case (e.g. "yorkshire_terrier_9.jpg") and cats with the first initials capitalized (e.g. "Abyssinian_100.jpg"). So our function to categorize the training data will check if the first letter is a capital letter and label it True if it is, False if it isn't, using the following function.

def its_a_cat(filename: str) -> bool:
    """Decide if file is a picture of a cat

    Args:
     filename: name of file where first letter is capitalized if it's a cat

    Returns:
     True if first letter is capitalized (so it's a picture of a cat)
    """
    return filename[0].isupper()
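
A quick check of the rule on the two example file names (the function is repeated so the snippet runs on its own):

```python
def its_a_cat(filename: str) -> bool:
    """Same rule as above: a capitalized first letter means it's a cat."""
    return filename[0].isupper()


for name in ("Abyssinian_100.jpg", "yorkshire_terrier_9.jpg"):
    print(f"{name}: {'cat' if its_a_cat(name) else 'dog'}")
```

Note that this only works if the function is given the bare file name. Given a full path it would be checking the first character of the path instead.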

This next bit creates a batch data loader for us.

Object	Description	Documentation
`ImageDataLoaders`	Data loader with functions for images.	ImageDataLoaders, from_name_func
`get_image_files`	Recursively retrieve images from folders.	docstring
`Resize`	Resize each image (if you pass in one size it uses it for all dimensions).	docstring

loader = ImageDataLoaders.from_name_func(
    path,
    get_image_files(path),
    valid_pct=0.2,
    seed=42,
    label_func=its_a_cat,
    item_tfms=Resize(224)
)

Now we create the model that learns to detect cats.

Object	Description	Documentation
`vision_learner`	Builds a model for transfer learning.	Arguments
`resnet34`	Residual Network model	torchvision documentation
`error_rate`	1 - accuracy (the fraction that was incorrect)	arguments
`to_fp16`	Use 16-bit (half-precision) floats	Mixed Precision Training Explained

learner = vision_learner(
    loader, resnet34, metrics=error_rate)

cat_model = learner.to_fp16()

Pretty much all of this is inexplicable if you haven't used some kind of neural network library before, but that last call (to_fp16) seems especially mysterious. This first part is just about making sure things work, though, so I'll wait until I get to the more detailed explanations to figure it out, although their article "Mixed Precision Training Explained" explains it pretty well.
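
To get a feel for what 16-bit floats give up, this sketch round-trips a few numbers through python's struct module, which can pack half-precision values with the "e" format. It only illustrates the number format. It says nothing about how to_fp16 or mixed precision training actually work, and to_half_and_back is a made-up helper.

```python
import struct


def to_half_and_back(value: float) -> float:
    """Store a float as a 16-bit half-precision value, then read it back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


for value in (0.1, 0.005412719678133726, 1000.1):
    print(value, "->", to_half_and_back(value))
```

The values that come back are close to the originals but not identical, which is the trade: less precision in exchange for half the memory per number.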

Train It

We're using a pre-trained model so we just have to do some transfer learning - freezing the weights of most of the layers and training the last layer to make a cat or not a cat classification.

For some reason fastai assumes that you'll only run it in a jupyter notebook and dumps out a progress bar with no simple way to disable it permanently. As a workaround I'll use the context-manager no_bar to turn off the progress bar temporarily.

Method	Description	Documentation
`fine_tune`	Does transfer learning (presumably)	None found, but here's the signatures for the freeze and unfreeze methods
`no_bar`	Turn off the progress bar.	docstring

with cat_model.no_bar():
    cat_model.fine_tune(1)

[0, 0.17085878551006317, 0.019044965505599976, 0.005412719678133726, '00:20']
[0, 0.05584857985377312, 0.01942548155784607, 0.0067658997140824795, '00:25']

Fastai really seems to want to force you to use their system the way they do - the output from fine_tune is printed to standard out and not returned as some kind of object so I can't re-format it to make it nicer looking here (using org-mode), but for reference, the columns for the two rows of output are:

epoch
train_loss
valid_loss
error_rate
time

Given these labels, the output of the last block shows that the error rate was about 0.005 in the first row and about 0.007 in the second, and that the two epochs took about twenty and twenty-five seconds respectively.
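
Since fine_tune only prints the rows, one way to make them easier to read is to paste them back in and zip them with the column names. The rows below are copied from the output above.

```python
COLUMNS = ("epoch", "train_loss", "valid_loss", "error_rate", "time")

# the two rows printed by fine_tune above
rows = [
    [0, 0.17085878551006317, 0.019044965505599976, 0.005412719678133726, '00:20'],
    [0, 0.05584857985377312, 0.01942548155784607, 0.0067658997140824795, '00:25'],
]

records = [dict(zip(COLUMNS, row)) for row in rows]
for record in records:
    print(f"epoch {record['epoch']}: error rate {record['error_rate']:.4f} ({record['time']})")
```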

Some Test Images

We're going to apply our model to some images of cats and a dog to see what it tells us about the image. Since it's the same process for each image I'll create a function check_image to handle it.

Object	Description	Documentation
`PILImage`	Object to represent images.	docstring
`create`	Load the image as PILImage	load_image, PILBase (follow source link to see definition of `create`)

def check_image(path: str) -> None:
    """Loads the image and checks if it's a cat

    Args:
     path: string with path to the image
    """
    POSITIVE, NEGATIVE = " think", " don't think"

    image = PILImage.create(Path(path).expanduser())

    with cat_model.no_bar():
        ees_cat, _, probabilities = cat_model.predict(image)
    print(f"I{POSITIVE if ees_cat=='True' else NEGATIVE} this is a cat.")
    print(f"The probability that it's a cat is {probabilities[1].item():.2f}")
    return

A Cat

Here's our first test image.

As you can see, it appears to be ridden with parasites, causing it to scratch uncontrollably (the toxoplasma isn't visible but assumed) - let's see how our classifier does at guessing that it's a cat.

check_image("~/test-cat.jpg")

I think this is a cat.
The probability that it's a cat is 1.00

So, it's pretty sure that this is a cat.

A Negative Test Image

We could try any image, but for now, since the dataset used dogs and cats, let's see if it thinks a dog is a cat.

check_image("~/test-dog.jpg")

I don't think this is a cat.
The probability that it's a cat is 0.00

It's sure that this isn't a cat.

A Strange Cat

I tried to find images of cats that looked like dogs or vice-versa, but it turns out that they're pretty different looking things, so let's just try an unusual looking cat.

check_image("~/elf-cat.jpg")

I think this is a cat.
The probability that it's a cat is 1.00

The End

So there you go, not really exciting, which I suppose is sort of the point of fastai - it should be simple, almost boring, to do image classification. This is just a rehash of what they did, of course. A better check would be to try something different, but since this is the first take it'll have to do for now.

The top post for the quickstart posts is this one and the next post will be on Image Segmentation.

Sources

Fast AI

The Quickstart
