Introduction To Altair

Setup

# python
from functools import partial
from pathlib import Path
import os

# pypi
from dotenv import load_dotenv
from expects import be_true, expect
from tabulate import tabulate

import altair
import pandas

# monkey
from graeae.visualization.altair_helpers import output_path, save_chart
SLUG = "introduction-to-altair"
OUTPUT_PATH = output_path(SLUG)

HEIGHT, WIDTH = 600, 800
SAVE_IT = partial(save_chart, output_path=OUTPUT_PATH, height=HEIGHT + 100)
TABLE = partial(partial(tabulate,
                        headers="keys",
                        tablefmt="orgtbl",
                        showindex=False))

The Data

load_dotenv(override=True)
table_path = Path(os.getenv("WORLD_HAPPINESS_TABLE"))

expect(table_path.is_file()).to(be_true)

table = pandas.read_csv(table_path)
print(table.shape)
(2199, 11)
def column_printer(table, headers=("Column", "Type")):
    print(TABLE(
        ((column, str(table[column].dtype))
         for column in table.columns),
        headers=headers))
    return
column_printer(table)
Column Type
Country name object
year int64
Life Ladder float64
Log GDP per capita float64
Social support float64
Healthy life expectancy at birth float64
Freedom to make life choices float64
Generosity float64
Perceptions of corruption float64
Positive affect float64
Negative affect float64
class Column:
    __slots__ = ()
    country = "Country name"
    year = "year"
    happiness = "Life Ladder"
    gdp = "Log GDP per capita"
    generosity = "Generosity"

Creating a Chart

chart = altair.Chart(table)
print(type(chart))
<class 'altair.vegalite.v5.api.Chart'>

The Chart class is defined in altair.vegalite.v5.api. This is its docstring description:

Create a basic Altair/Vega-Lite chart.

Although it is possible to set all Chart properties as constructor attributes, it is more idiomatic to use methods such as mark_point(), encode(), transform_filter(), properties(), etc. See Altair's documentation for details and examples: http://altair-viz.github.io/.

The attributes set by the Chart class' constructor (as opposed to being passed to its parent classes) are:

  • data
  • encoding
  • mark
  • width
  • height

By default they're set to Undefined which is an altair-defined object (see altair.utils.schemapi), and as noted, you don't normally set the attributes using the constructor (other than data which isn't mentioned in the docstring but appears to be passed to the Chart constructor by convention).

Chart Class

The methods take arguments, but I believe interactive (which I haven't seen called with arguments) is the only method that I'll be using - all the other methods you use belong to parent classes.

TopLevelUnitSpec

nil

Only the top-most parent (SchemaBase) has a method of interest here.

SchemaBase

nil

There are many more methods in altair.utils.schemapi.SchemaBase but I'm highlighting copy here because it gets used quite a bit by the other classes but is defined in this somewhat obscure place. The behavior is what you'd expect so I don't see a need to go over it, but it's one of those mystery methods that just pops up when you use deep inheritance like this that makes you wonder what's going on so I'll document it here, for now.

note: The details of the code probably shouldn't go into the introduction-introduction. Maybe put this stuff further down or in another post.

_EncodingMixin

nil

The encoding method takes in whatever combination of positional and keyword arguments you pass into it and uses them along with the values that are already set in the Chart's encoding attribute to update the encoding before returning the chart (a copy so the original chart isn't changed).

The encoding attribute is an instance of the altair.vegalite.v5.schema.FacetedEncoding class.

Making It a Bar Chart

bar_chart = chart.mark_bar()
print(type(bar_chart))
<class 'altair.vegalite.v5.api.Chart'>

MarkMixin Class

Looking in the mark_bar method, there's a lot of arguments you could pass to it, but fundamentally all it's really doing is setting the mark attribute to bar.

print("Chart Mark: '{}'".format(chart.mark))
print("Bar Chart Mark: '{}'".format(bar_chart.mark))
Chart Mark: 'Undefined'
Bart Chart Mark: 'bar'

Year

year_counts = table.year.value_counts().reset_index().sort_values("year").T
year_counts.columns = year_counts.iloc[0]
year_counts = year_counts.drop(year_counts.index[0])
print(TABLE(year_counts, showindex=True))
  2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022
count 27 89 102 110 114 124 146 141 136 144 142 141 147 141 143 116 122 114

Now as a bar-chart.

year_chart = bar_chart.encode(
    x="year:N",
    y="count()")

year_chart = year_chart.properties(height=HEIGHT, width=WIDTH)

SAVE_IT(year_chart, "year-counts-bar-chart")

Figure Missing

nil

This method makes a copy of the class, validates the arguments and then sets the properties on itself. The arguments are based on the JSON Schema passed to vega-lite.

Life Ladder

By Year

boxplot = chart.mark_boxplot(extent="min-max").encode(
    x="{}:O".format(Column.year),
    y=Column.happiness
).properties(height=HEIGHT, width=WIDTH)

SAVE_IT(boxplot, "happiness-year-boxplots")

Figure Missing

The Most Recent Year (2022)

print(table.year.dtype)
int64
data_2022 = table[table.year==2022]
chart_2022 = altair.Chart(data_2022).properties(
    height=HEIGHT, width=WIDTH)

bar_chart_2022 = chart_2022.mark_bar()

ladder_chart = bar_chart_2022.encode(
    x=altair.X(Column.happiness, bin=True),
    y="count()"
)

SAVE_IT(ladder_chart, "ladder-histogram")

Figure Missing

GDP

scatter = chart_2022.mark_circle()
print(scatter.mark)
circle
gdp_scatter = scatter.encode(
    x=Column.gdp,
    y=Column.happiness
)

SAVE_IT(gdp_scatter, "gdp-vs-happiness")

Figure Missing

With Generosity

gdp_generosity = scatter.encode(
    x=Column.happiness,
    y=Column.generosity,
    color=Column.gdp,
    tooltip=[Column.country, Column.happiness, Column.gdp,
             Column.generosity]
)

SAVE_IT(gdp_generosity, "gdp-and-generosity")

Figure Missing

Links

Tutorial Sources

The Data

Altair