POS Tagging: Checking the Accuracy of the Model

Predicting on a data set

Compute the accuracy of your prediction by comparing it with the true `y` labels.

  • `pred` is a list of predicted POS tags corresponding to the words of the `test_corpus`.

Imports

# python
import math

# pypi
from dotenv import load_dotenv

# this stuff
from neurotic.nlp.parts_of_speech import DataLoader, HiddenMarkov, Matrices, TheTrainer

# some other stuff
from graeae import Timer

Set Up

The Timer

TIMER = Timer()

The Matrices

with TIMER:
    load_dotenv("posts/nlp/.env")
    loader = DataLoader()
    trainer = TheTrainer(loader.processed_training)
    matrices = Matrices(transition_counts=trainer.transition_counts,
                        emission_counts=trainer.emission_counts,
                        words=loader.vocabulary_words,
                        tag_counts=trainer.tag_counts,
                        alpha=0.001)
    model = HiddenMarkov(loader, trainer, matrices)
    model()
2020-11-30 20:41:51,497 graeae.timers.timer start: Started: 2020-11-30 20:41:51.497226
2020-11-30 20:43:32,311 graeae.timers.timer end: Ended: 2020-11-30 20:43:32.311608
2020-11-30 20:43:32,312 graeae.timers.timer end: Elapsed: 0:01:40.814382

These classes (DataLoader, TheTrainer, Matrices, and HiddenMarkov) were defined in earlier posts.

Middle

Aliases

The original notebooks use a naming scheme that I don't really like, so their objects have to be aliased to make my stuff work with theirs.

prep = loader.test_words
pred = model.predictions

raw = loader.test_data_tuples

# some of the lines in the test set are empty, so patch them with a
# placeholder word and the newline tag
missing = [index for index, pair in enumerate(raw) if not pair]
for index in missing:
    raw[index] = ("", "--n--")

y = [label for word, label in raw]

print('The third word is:', prep[3])
print('Your prediction is:', pred[3])
print('Your corresponding label y is: ', y[3])

assert len(y) == len(prep)
assert len(y) == len(pred)
The third word is: temperature
Your prediction is: NN
Your corresponding label y is:  NN

Redo Y

Now that I look at their code, they expect the "y" list to be the un-split strings. Ugh.

y = loader.test_data_raw
print('Your corresponding label y is: ', y[3])
Your corresponding label y is:  temperature     NN
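Each entry in `y` is still a raw line of the form `word<TAB>tag`, so splitting it recovers the pair again (a quick sanity check, using the same `y` as above):

word, tag = y[3].split()
print(word, tag)
temperature NN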

Compute Accuracy

def compute_accuracy(pred: list, y: list) -> float:
    """Calculate the accuracy of our model's predictions

    Args:
      pred: a list of the predicted parts-of-speech
      y: a list of lines where each word is separated by a tab (i.e. word \t tag)
    Returns:
      accuracy of the predictions
    """
    num_correct = 0
    total = 0

    # Zip together the predictions and the labels
    for prediction, label in zip(pred, y):
        # Split the label into the word and the POS tag
        word_tag_tuple = label.split()

        # Check that there is actually a word and a tag:
        # no more and no less than 2 items
        if len(word_tag_tuple) != 2:
            continue

        # store the word and tag separately
        word, tag = word_tag_tuple

        # Check if the POS tag label matches the prediction
        if tag == prediction:
            # count the number of times that the prediction
            # and label match
            num_correct += 1

        # keep track of the total number of examples (that have valid labels)
        total += 1

    return num_correct/total
accuracy = compute_accuracy(pred, y)
print(f"Accuracy of the Viterbi algorithm is {accuracy:.4f}")
assert math.isclose(accuracy, 0.95, abs_tol=1e-2)
Accuracy of the Viterbi algorithm is 0.9545
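As an aside, the same calculation can be squeezed into a couple of comprehensions (an equivalent sketch, not the way the original notebook writes it):

def compute_accuracy_compact(pred: list, y: list) -> float:
    """Equivalent accuracy calculation using comprehensions."""
    pairs = [(prediction, label.split())
             for prediction, label in zip(pred, y)]
    # keep only the lines that split cleanly into (word, tag)
    valid = [(prediction, parts[1])
             for prediction, parts in pairs
             if len(parts) == 2]
    return sum(tag == prediction for prediction, tag in valid)/len(valid)

assert math.isclose(compute_accuracy_compact(pred, y), accuracy)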

Note: The original notebook accuracy was 0.9531. I don't really know what caused the difference - I suspect their preprocessing - but since this is better I'll keep it.
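If you wanted to chase down the difference, tallying the (true tag, predicted tag) pairs that disagree would be one place to start (a sketch using the same `pred` and `y` as above; I haven't run it here, so the output is omitted):

from collections import Counter

confusions = Counter()
for prediction, label in zip(pred, y):
    parts = label.split()
    if len(parts) != 2:
        continue
    word, tag = parts
    if tag != prediction:
        # count each (true tag, predicted tag) disagreement
        confusions[(tag, prediction)] += 1

print(confusions.most_common(5))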

End

So, there you go, parts-of-speech tagging.