Sentiment Analysis: Training the Model

Training the Model

In the previous post we defined our Deep Learning model for Sentiment Analysis. Now we'll turn to training it on our data.

To train a model on a task, Trax defines an abstraction, trax.supervised.training.TrainTask, which packages the training data, loss and optimizer (among other things) together into an object.

Similarly, to evaluate a model, Trax defines an abstraction, trax.supervised.training.EvalTask, which packages the eval data and metrics (among other things) into another object.

The final piece tying things together is the trax.supervised.training.Loop abstraction, a very simple and flexible way to put everything together and train the model, all the while evaluating it and saving checkpoints. Using Loop will save you a lot of code compared to always writing the training loop by hand, like you did in courses 1 and 2. More importantly, you are less likely to have a bug in that code that would ruin your training.
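To make the "by hand" comparison concrete, here is a minimal sketch of the bookkeeping that Loop takes over: update the parameters, and every so often stop to evaluate. It is not Trax code. The toy logistic regression, the function name train_by_hand and the hyperparameters are all assumptions made up for illustration, and the evaluation re-uses the training data only to keep it short.

```python
import numpy as np


def train_by_hand(inputs, targets, n_steps=200, eval_every=50, learning_rate=0.5):
    """Fit a one-feature logistic regression with plain gradient descent."""
    weight, bias = 0.0, 0.0
    history = []  # (step, loss) pairs recorded at each evaluation
    for step in range(1, n_steps + 1):
        # forward pass: sigmoid of a linear function of the input
        probabilities = 1.0 / (1.0 + np.exp(-(weight * inputs + bias)))
        # gradient of the mean binary cross-entropy
        error = probabilities - targets
        weight -= learning_rate * np.mean(error * inputs)
        bias -= learning_rate * np.mean(error)
        if step % eval_every == 0:
            # the periodic evaluation that a training loop has to remember to do
            probabilities = 1.0 / (1.0 + np.exp(-(weight * inputs + bias)))
            loss = -np.mean(targets * np.log(probabilities)
                            + (1 - targets) * np.log(1 - probabilities))
            history.append((step, float(loss)))
    return weight, bias, history


inputs = np.array([-2.0, -1.0, 1.0, 2.0])
targets = np.array([0.0, 0.0, 1.0, 1.0])
weight, bias, history = train_by_hand(inputs, targets)
for step, loss in history:
    print(f"Step {step:4d}: loss {loss:.4f}")
```

Even this toy version has to track steps, evaluation intervals and a history by itself, and it does not save checkpoints at all, which is the part that is easy to get wrong when written by hand.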


# from python
from functools import partial
from pathlib import Path

import random

# from pypi
from trax.supervised import training

import nltk
import trax
import trax.layers as trax_layers
import trax.fastmath.numpy as numpy

# this project
from neurotic.nlp.twitter.tensor_generator import TensorBuilder, TensorGenerator

This next part (re-downloading the dataset) is just because I have to keep setting up new containers to get trax to work…

nltk.download("twitter_samples", download_dir="/home/neurotic/data/datasets/nltk_data/")


The Dataset


converter = TensorBuilder()

train_generator = partial(TensorGenerator, converter,
training_generator = train_generator()

valid_generator = partial(TensorGenerator,
validation_generator = valid_generator()

size_of_vocabulary = len(converter.vocabulary)

Here's the Model

This was defined in the last post. It seems like too much trouble not to just copy it over.

def classifier(vocab_size: int=size_of_vocabulary,
               embedding_dim: int=256,
               output_dim: int=2) -> trax_layers.Serial:
    """Creates the classifier model

    Args:
     vocab_size: number of tokens in the training vocabulary
     embedding_dim: output dimension for the Embedding layer
     output_dim: dimension for the Dense layer

    Returns:
     the composed layer-model
    """
    embed_layer = trax_layers.Embedding(
        vocab_size=vocab_size, # Size of the vocabulary
        d_feature=embedding_dim)  # Embedding dimension

    mean_layer = trax_layers.Mean(axis=1)

    dense_output_layer = trax_layers.Dense(n_units = output_dim)

    log_softmax_layer = trax_layers.LogSoftmax()

    model = trax_layers.Serial(
        embed_layer,
        mean_layer,
        dense_output_layer,
        log_softmax_layer
    )
    return model
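To see what the four layers do to the shape of a batch, here is a rough NumPy sketch of the same forward pass (embed, average over the tokens, dense layer, log-softmax). It is not Trax code: the random weights, the tiny sizes and the forward function are stand-ins assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embedding_dim, output_dim = 10, 4, 2

# random stand-ins for the weights that training would learn
embeddings = rng.normal(size=(vocab_size, embedding_dim))
dense_weights = rng.normal(size=(embedding_dim, output_dim))
dense_bias = np.zeros(output_dim)


def forward(token_ids):
    """Embedding -> Mean(axis=1) -> Dense -> LogSoftmax, in plain NumPy."""
    embedded = embeddings[token_ids]                  # (batch, length, embedding_dim)
    averaged = embedded.mean(axis=1)                  # (batch, embedding_dim)
    logits = averaged @ dense_weights + dense_bias    # (batch, output_dim)
    # log-softmax, shifted by the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


# two padded "tweets" of four token ids each
batch = np.array([[1, 2, 3, 0], [4, 5, 0, 0]])
log_probabilities = forward(batch)
print(log_probabilities.shape)  # (2, 2)
```

Each row of the output holds two log probabilities, one per class, so exponentiating a row gives values that sum to one.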

Now to train the model.

First define the TrainTask, EvalTask and Loop in preparation for training the model.


# train_generator(batch_size=batch_size, shuffle=True),

train_task = training.TrainTask(

eval_task = training.EvalTask(
    metrics=[trax_layers.CrossEntropyLoss(), trax_layers.Accuracy()],

model = classifier()

This defines a model trained using trax_layers.CrossEntropyLoss and optimized with the trax.optimizers.Adam optimizer, all the while tracking the accuracy with the trax_layers.Accuracy metric. We also track trax_layers.CrossEntropyLoss on the validation set.
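As a rough idea of what those two metrics measure, here is an unweighted NumPy sketch that works on log probabilities like the ones our model outputs. The functions cross_entropy and accuracy are my own simplified stand-ins, not the Trax implementations (which also handle example weights), and the numbers are made up.

```python
import numpy as np


def cross_entropy(log_probabilities, targets):
    """Mean negative log probability assigned to the correct class."""
    rows = np.arange(len(targets))
    return -log_probabilities[rows, targets].mean()


def accuracy(log_probabilities, targets):
    """Fraction of rows whose most likely class matches the target."""
    return (log_probabilities.argmax(axis=1) == targets).mean()


# made-up predictions for three examples and two classes
log_probabilities = np.log(np.array([[0.9, 0.1],
                                     [0.2, 0.8],
                                     [0.6, 0.4]]))
targets = np.array([0, 1, 1])

print(f"cross entropy: {cross_entropy(log_probabilities, targets):.4f}")
print(f"accuracy: {accuracy(log_probabilities, targets):.4f}")
```

The third example is misclassified, so the accuracy is two out of three, while the cross entropy also penalizes how confident each prediction was.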

Now let's make an output directory and train the model.

output_path = Path("~/models/").expanduser()
if not output_path.is_dir():
    output_path.mkdir(parents=True)

def train_model(classifier, train_task, eval_task, n_steps, output_dir):
    """Create and run the training loop

    Args:
       classifier - the model you are building
       train_task - Training task
       eval_task - Evaluation task
       n_steps - the number of training steps to run
       output_dir - folder to save your files

    Returns:
       trainer -  trax trainer
    """
    training_loop = training.Loop(
                                model=classifier, # The learning model
                                tasks=train_task, # The training task
                                eval_tasks = eval_task, # The evaluation task
                                output_dir = output_dir) # The output directory

    training_loop.run(n_steps = n_steps)
    # Return the training_loop, since it has the model.
    return training_loop

training_loop = train_model(model, train_task, eval_task, 100, output_path)

Step    110: Ran 10 train steps in 6.06 secs
Step    110: train CrossEntropyLoss |  0.00527583
Step    110: eval  CrossEntropyLoss |  0.00304692
Step    110: eval          Accuracy |  1.00000000

Step    120: Ran 10 train steps in 2.06 secs
Step    120: train CrossEntropyLoss |  0.02130376
Step    120: eval  CrossEntropyLoss |  0.00000677
Step    120: eval          Accuracy |  1.00000000

Step    130: Ran 10 train steps in 0.75 secs
Step    130: train CrossEntropyLoss |  0.01026674
Step    130: eval  CrossEntropyLoss |  0.00424393
Step    130: eval          Accuracy |  1.00000000

Step    140: Ran 10 train steps in 1.33 secs
Step    140: train CrossEntropyLoss |  0.00172522
Step    140: eval  CrossEntropyLoss |  0.00004072
Step    140: eval          Accuracy |  1.00000000

Step    150: Ran 10 train steps in 0.77 secs
Step    150: train CrossEntropyLoss |  0.00002847
Step    150: eval  CrossEntropyLoss |  0.00000232
Step    150: eval          Accuracy |  1.00000000

Step    160: Ran 10 train steps in 0.78 secs
Step    160: train CrossEntropyLoss |  0.00002123
Step    160: eval  CrossEntropyLoss |  0.00104654
Step    160: eval          Accuracy |  1.00000000

Step    170: Ran 10 train steps in 0.79 secs
Step    170: train CrossEntropyLoss |  0.00001706
Step    170: eval  CrossEntropyLoss |  0.00000080
Step    170: eval          Accuracy |  1.00000000

Step    180: Ran 10 train steps in 0.83 secs
Step    180: train CrossEntropyLoss |  0.00001554
Step    180: eval  CrossEntropyLoss |  0.00000989
Step    180: eval          Accuracy |  1.00000000

Step    190: Ran 10 train steps in 0.85 secs
Step    190: train CrossEntropyLoss |  0.00639312
Step    190: eval  CrossEntropyLoss |  0.00255337
Step    190: eval          Accuracy |  1.00000000

Step    200: Ran 10 train steps in 0.85 secs
Step    200: train CrossEntropyLoss |  0.00124322
Step    200: eval  CrossEntropyLoss |  0.02190475
Step    200: eval          Accuracy |  1.00000000

Bundle It Up


# python
from pathlib import Path

# from pypi
from trax.supervised import training

import attr
import trax
import trax.layers as trax_layers

The Trainer

@attr.s(auto_attribs=True)
class SentimentNetwork:
    """Builds and Trains the Sentiment Analysis Model

    Args:
     training_generator: generator of training batches
     validation_generator: generator of validation batches
     vocabulary_size: number of tokens in the training vocabulary
     training_loops: number of times to run the training loop
     output_path: path to where to store the model
     embedding_dimension: output dimension for the Embedding layer
     output_dimension: dimension for the Dense layer
    """
    vocabulary_size: int
    training_generator: object
    validation_generator: object
    training_loops: int
    output_path: Path
    embedding_dimension: int=256
    output_dimension: int=2
    _model: trax_layers.Serial=None
    _training_task: training.TrainTask=None
    _evaluation_task: training.EvalTask=None
    _training_loop: training.Loop=None
  • The Model
    def model(self) -> trax_layers.Serial:
        """The Embeddings model"""
        if self._model is None:
            self._model = trax_layers.Serial(
        return self._model
  • The Training Task
    def training_task(self) -> training.TrainTask:
        """The training task for training the model"""
        if self._training_task is None:
            self._training_task = training.TrainTask(
        return self._training_task
  • Evaluation Task
    def evaluation_task(self) -> training.EvalTask:
        """The validation evaluation task"""
        if self._evaluation_task is None:
            self._evaluation_task = training.EvalTask(
        return self._evaluation_task
  • Training Loop
    def training_loop(self) -> training.Loop:
        """The thing to run the training"""
        if self._training_loop is None:
            self._training_loop = training.Loop(
                output_dir= self.output_path) 
        return self._training_loop
  • Fitting the Model
    def fit(self):
        """Runs the training loop"""

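The class above builds each piece only when it is first asked for: every attribute starts as None and is filled in on first access. Here is a stripped-down sketch of that pattern on its own. LazyHolder and its builder argument are hypothetical names for illustration, and I am assuming the methods are meant to be used as properties.

```python
class LazyHolder:
    """Builds a value the first time it is requested, then re-uses it."""

    def __init__(self, builder):
        self._builder = builder
        self._value = None
        self.builds = 0  # counts how often the builder actually ran

    @property
    def value(self):
        if self._value is None:
            self._value = self._builder()
            self.builds += 1
        return self._value


holder = LazyHolder(lambda: ["embedding", "mean", "dense", "log-softmax"])
first = holder.value
second = holder.value
print(first is second, holder.builds)  # True 1
```

The pay-off is that creating the object is cheap, and anything expensive (like building the model) happens once, only if something actually needs it.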
Practice In Making Predictions

Now that you have trained a model, you can access it as the training_loop.model object. We will actually use training_loop.eval_model, and in the coming weeks you will learn why we sometimes use a different model for evaluation, e.g., one without dropout. For now, make predictions with your model.

Use the training data just to see how the prediction process works.

  • Later, you will use validation data to evaluate your model's performance.

Create a generator object.

tmp_train_generator = train_generator(batch_size=16)

Get one batch.

tmp_batch = next(tmp_train_generator)

Position 0 has the model inputs (tweets as tensors), position 1 has the targets (the actual labels), and position 2 has the example weights.

tmp_inputs, tmp_targets, tmp_example_weights = tmp_batch

print(f"The batch is a tuple of length {len(tmp_batch)} because position 0 contains the tweets, and position 1 contains the targets.") 
print(f"The shape of the tweet tensors is {tmp_inputs.shape} (num of examples, length of tweet tensors)")
print(f"The shape of the labels is {tmp_targets.shape}, which is the batch size.")
print(f"The shape of the example_weights is {tmp_example_weights.shape}, which is the same as inputs/targets size.")
The batch is a tuple of length 3 because position 0 contains the tweets, and position 1 contains the targets.
The shape of the tweet tensors is (16, 14) (num of examples, length of tweet tensors)
The shape of the labels is (16,), which is the batch size.
The shape of the example_weights is (16,), which is the same as inputs/targets size.

Feed the tweet tensors into the model to get a prediction.

tmp_pred = training_loop.eval_model(tmp_inputs)
print(f"The prediction shape is {tmp_pred.shape}, num of tensor_tweets as rows")
print("Column 0 is the probability of a negative sentiment (class 0)")
print("Column 1 is the probability of a positive sentiment (class 1)")
print("View the prediction array")
print(tmp_pred)
The prediction shape is (16, 2), num of tensor_tweets as rows
Column 0 is the probability of a negative sentiment (class 0)
Column 1 is the probability of a positive sentiment (class 1)

View the prediction array
[[-1.2960873e+01 -2.3841858e-06]
 [-5.6474457e+00 -3.5326481e-03]
 [-5.3460855e+00 -4.7781467e-03]
 [-7.6736917e+00 -4.6515465e-04]
 [-5.2682662e+00 -5.1658154e-03]
 [-1.0566207e+01 -2.5749207e-05]
 [-5.6388092e+00 -3.5634041e-03]
 [-3.9540453e+00 -1.9363165e-02]
 [ 0.0000000e+00 -2.0700916e+01]
 [ 0.0000000e+00 -2.2949795e+01]
 [ 0.0000000e+00 -2.3168846e+01]
 [ 0.0000000e+00 -2.4553205e+01]
 [-9.5367432e-07 -1.3878939e+01]
 [ 0.0000000e+00 -1.6655178e+01]
 [ 0.0000000e+00 -1.5975946e+01]
 [ 0.0000000e+00 -2.0577690e+01]]

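The model ends with a LogSoftmax layer, so these values are log probabilities rather than probabilities. Because the logarithm is monotonic, comparing them gives the same answer as comparing the probabilities themselves. To turn them into categories (negative or positive sentiment prediction), for each row:

  • Compare the values in each column.
  • If column 1 has a value greater than column 0, classify that as a positive tweet.
  • Otherwise, if column 1 is less than or equal to column 0, classify that example as a negative tweet.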
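As a quick check on that rule, this sketch takes two rows from the prediction array printed above, exponentiates them to recover the probabilities, and confirms that the comparison comes out the same either way. It uses plain NumPy instead of trax.fastmath.numpy so that it runs without Trax.

```python
import numpy as np

# two rows copied from the prediction array above
log_probabilities = np.array([[-5.6474457e+00, -3.5326481e-03],
                              [-9.5367432e-07, -1.3878939e+01]])

# exponentiating undoes the log, giving probabilities that sum to about one
probabilities = np.exp(log_probabilities)
print(probabilities.sum(axis=1))

# the comparison gives the same answer on either scale
by_log = log_probabilities[:, 1] > log_probabilities[:, 0]
by_probability = probabilities[:, 1] > probabilities[:, 0]
print(by_log.tolist(), by_probability.tolist())
```

So there is no need to convert back to probabilities before classifying a tweet.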

Turn the log probabilities into category predictions.

tmp_is_positive = tmp_pred[:,1] > tmp_pred[:,0]
for i, p in enumerate(tmp_is_positive):
    print(f"Neg log prob {tmp_pred[i,0]:.4f}\tPos log prob {tmp_pred[i,1]:.4f}\t is positive? {p}\t actual {tmp_targets[i]}")
Neg log prob -12.9609   Pos log prob -0.0000     is positive? True       actual 1
Neg log prob -5.6474    Pos log prob -0.0035     is positive? True       actual 1
Neg log prob -5.3461    Pos log prob -0.0048     is positive? True       actual 1
Neg log prob -7.6737    Pos log prob -0.0005     is positive? True       actual 1
Neg log prob -5.2683    Pos log prob -0.0052     is positive? True       actual 1
Neg log prob -10.5662   Pos log prob -0.0000     is positive? True       actual 1
Neg log prob -5.6388    Pos log prob -0.0036     is positive? True       actual 1
Neg log prob -3.9540    Pos log prob -0.0194     is positive? True       actual 1
Neg log prob 0.0000     Pos log prob -20.7009    is positive? False      actual 0
Neg log prob 0.0000     Pos log prob -22.9498    is positive? False      actual 0
Neg log prob 0.0000     Pos log prob -23.1688    is positive? False      actual 0
Neg log prob 0.0000     Pos log prob -24.5532    is positive? False      actual 0
Neg log prob -0.0000    Pos log prob -13.8789    is positive? False      actual 0
Neg log prob 0.0000     Pos log prob -16.6552    is positive? False      actual 0
Neg log prob 0.0000     Pos log prob -15.9759    is positive? False      actual 0
Neg log prob 0.0000     Pos log prob -20.5777    is positive? False      actual 0

Notice that since you are making predictions on a training batch, the model's predictions are more likely to match the actual targets (labels).

  • Every prediction that a tweet is positive matches the actual target of 1 (positive sentiment).
  • Similarly, every prediction that the sentiment is not positive matches the actual target of 0 (negative sentiment).

One more useful thing to know is how to check whether a prediction matches the actual target (label).

  • The result of the is_positive calculation is a boolean.
  • The target is of type trax.fastmath.numpy.int32.
  • If you expect to be doing division, you may prefer to work with decimal numbers with the data type trax.fastmath.numpy.float32.

View the array of booleans.

print("Array of booleans")
Array of booleans
DeviceArray([ True,  True,  True,  True,  True,  True,  True,  True,
             False, False, False, False, False, False, False, False],            dtype=bool)

Convert booleans to type int32.

  • True is converted to 1
  • False is converted to 0

tmp_is_positive_int = tmp_is_positive.astype(trax.fastmath.numpy.int32)

View the array of integers.

print("Array of integers")
Array of integers
DeviceArray([1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)

Convert boolean to type float32.

tmp_is_positive_float = tmp_is_positive.astype(numpy.float32)

View the array of floats.

print("Array of floats")
Array of floats
DeviceArray([1., 1., 1., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0., 0.,
             0.], dtype=float32)
(16, 2)

Note that Python usually does type conversion for you when you compare a boolean to an integer.

  • True compared to 1 is True; compared to any other integer it is False.
  • False compared to 0 is True; compared to any other integer it is False.

print(f"True == 1: {True == 1}")
print(f"True == 2: {True == 2}")
print(f"False == 0: {False == 0}")
print(f"False == 2: {False == 2}")
True == 1: True
True == 2: False
False == 0: True
False == 2: False

However, we recommend that you keep track of the data type of your variables to avoid unexpected outcomes. So it helps to convert the booleans into integers.

Compare 1 to 1 rather than comparing True to 1.
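Here is a small sketch of the difference between relying on the implicit conversion and converting explicitly, using plain NumPy as a stand-in for trax.fastmath.numpy. The arrays are made up.

```python
import numpy as np

predictions = np.array([True, True, False, False])
targets = np.array([1, 0, 0, 1], dtype=np.int32)

# implicit: NumPy converts the booleans for you before comparing
implicit = predictions == targets

# explicit: convert first, so both sides are known to be int32
explicit = predictions.astype(np.int32) == targets

print(implicit.tolist())
print(explicit.tolist())

# the comparison is boolean again, so convert once more before doing arithmetic
print(explicit.astype(np.float32).mean())
```

Both comparisons agree here, but with the explicit version you always know which data type you are holding at each step.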

Hopefully you are now familiar with what kinds of inputs and outputs the model uses when making a prediction.

  • This will help you implement a function that estimates the accuracy of the model's predictions.


Computing the accuracy of a batch

You will now write a function that evaluates your model on the validation set and returns the accuracy.

  • preds contains the predictions.
  • Its dimensions are (batch_size, output_dim). output_dim is two in this case. Column 0 contains the probability that the tweet belongs to class 0 (negative sentiment). Column 1 contains probability that it belongs to class 1 (positive sentiment).
  • If the probability in column 1 is greater than the probability in column 0, then interpret this as the model's prediction that the example has label 1 (positive sentiment).
  • Otherwise, if the probabilities are equal or the probability in column 0 is higher, the model's prediction is 0 (negative sentiment).
  • y contains the actual labels.
  • y_weights contains the weights to give to predictions.

def compute_accuracy(preds: numpy.ndarray,
                     y: numpy.ndarray,
                     y_weights: numpy.ndarray) -> tuple:
    """Compute a batch accuracy

    Args:
       preds: a tensor of shape (dim_batch, output_dim) 
       y: a tensor of shape (dim_batch,) with the true labels
       y_weights: an ndarray with a weight for each example

    Returns:
       accuracy: a float between 0-1 
       weighted_num_correct (np.float32): Sum of the weighted correct predictions
       sum_weights (np.float32): Sum of the weights
    """
    # Create an array of booleans, 
    # True if the (log) probability of positive sentiment is greater than
    # the (log) probability of negative sentiment
    # else False
    is_pos =  preds[:, 1] > preds[:, 0]

    # convert the array of booleans into an array of np.int32
    is_pos_int = is_pos.astype(numpy.int32)

    # compare the array of predictions (as int32) with the target (labels) of type int32
    correct = is_pos_int == y

    # Count the sum of the weights.
    sum_weights = y_weights.sum()

    # convert the array of correct predictions (boolean) into an array of np.float32
    correct_float = correct.astype(numpy.float32)

    # Multiply each prediction with its corresponding weight.
    weighted_correct_float = correct_float * y_weights

    # Sum up the weighted correct predictions (of type np.float32), to go in the
    # numerator.
    weighted_num_correct = weighted_correct_float.sum()

    # Divide the number of weighted correct predictions by the sum of the
    # weights.
    accuracy = weighted_num_correct/sum_weights

    return accuracy, weighted_num_correct, sum_weights
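To see what the weights do, here is a compact NumPy re-statement of the same calculation run on made-up predictions. The name weighted_accuracy and the data are assumptions for illustration. The last example is wrong, so giving it a weight of zero removes it from both the numerator and the denominator.

```python
import numpy as np


def weighted_accuracy(preds, y, y_weights):
    """NumPy version of the weighted batch accuracy."""
    predicted = (preds[:, 1] > preds[:, 0]).astype(np.int32)
    correct = (predicted == y).astype(np.float32)
    weighted_num_correct = (correct * y_weights).sum()
    sum_weights = y_weights.sum()
    return weighted_num_correct / sum_weights, weighted_num_correct, sum_weights


# made-up log probabilities for four examples; the last row is a tie
preds = np.log(np.array([[0.1, 0.9],
                         [0.7, 0.3],
                         [0.4, 0.6],
                         [0.5, 0.5]]))
y = np.array([1, 0, 0, 1], dtype=np.int32)

all_ones = np.ones(4, dtype=np.float32)
last_ignored = np.array([1, 1, 1, 0], dtype=np.float32)

print(weighted_accuracy(preds, y, all_ones))
print(weighted_accuracy(preds, y, last_ignored))
```

With every weight set to one, two of the four predictions are correct. With the last example weighted at zero, it is two of three.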

Get one batch.

tmp_val_generator = valid_generator(batch_size=64)
tmp_batch = next(tmp_val_generator)

Position 0 has the model inputs (tweets as tensors), position 1 has the targets (the actual labels), and position 2 has the example weights.

tmp_inputs, tmp_targets, tmp_example_weights = tmp_batch

Feed the tweet tensors into the model to get a prediction.

tmp_pred = training_loop.eval_model(tmp_inputs)
tmp_acc, tmp_num_correct, tmp_num_predictions = compute_accuracy(preds=tmp_pred, y=tmp_targets, y_weights=tmp_example_weights)

print(f"Model's prediction accuracy on a single validation batch is: {100 * tmp_acc}%")
print(f"Weighted number of correct predictions {tmp_num_correct}; weighted number of total observations predicted {tmp_num_predictions}")
Model's prediction accuracy on a single validation batch is: 100.0%
Weighted number of correct predictions 64.0; weighted number of total observations predicted 64
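Returning the weighted count and the sum of the weights alongside the accuracy makes it possible to combine several batches correctly. This is a sketch of that idea with made-up batch results; overall_accuracy is a hypothetical helper, not something defined in this post.

```python
def overall_accuracy(batch_results):
    """Combine (weighted_num_correct, sum_weights) pairs from several batches."""
    total_correct = 0.0
    total_weight = 0.0
    for num_correct, weight in batch_results:
        total_correct += num_correct
        total_weight += weight
    return total_correct / total_weight


# made-up results: a full batch of 64 and a smaller final batch of 16
batches = [(60.0, 64.0), (8.0, 16.0)]

pooled = overall_accuracy(batches)
naive = sum(correct / weight for correct, weight in batches) / len(batches)
print(pooled)  # 68 / 80 = 0.85
print(naive)   # (0.9375 + 0.5) / 2 = 0.71875
```

Averaging the per-batch accuracies over-counts the small batch, while pooling the counts weights every example equally.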


Now that we have a trained model, in the next post we'll test how well it did.