Bokeh Test
The Plot
What This Is
This is a re-do of the final plot done for data-science with python course 2 week 4. The original was done with matplotlib and this was done with bokeh to get some interaction working. When I tried to create it the first time bokeh raised some errors saying that height had been defined more than once. I don't know what really caused it - possibly a namespace clash where I was re-using something I didn't intend to re-use - but when I created a new notebook that only created the one plot it worked. Since this uses javascript I used Jupyter and the web-inteface to test it out (emacs ipython doesn't seem to be able to render javascript (unless I'm doing it wrong)).
How It Got Exported
I won't go over the creating of the data (since I just copied it from an earlier notebook) but this is how the bokeh plot was created.
Imports
These were the bokeh parts I needed.
from bokeh.models import ( BoxAnnotation, CustomJS, Span, Toggle, ) from bokeh.io import ( output_file, output_notebook, show, ) from bokeh.plotting import ( figure, ColumnDataSource, ) from bokeh.models import ( CrosshairTool, HoverTool, PanTool, ResetTool, ResizeTool, SaveTool, UndoTool, WheelZoomTool, ) from bokeh.layouts import column from bokeh.resources import CDN from bokeh.embed import autoload_static
Some Constants
NATIONAL_COLOR = "slategrey" NATIONAL_LABEL = "National" PORTLAND_COLOR = "cornflowerblue" PORTLAND_LABEL = "Portland-Hillsboro-Vancouver" S_AND_P_COLOR = "#90151B" S_AND_P_LABEL = "S & P 500 Index" HOUSING_COLOR = "#D89159" HOUSING_LABEL = "House Price Index"
The Data
bokeh doesn't work with pandas DataFrame's (or at least I couldn't get it to work). Instead you create a DataFrame-like object using the ColumnDataSource.
portland_source = ColumnDataSource( data=dict( month_data=portland.datetime, unemployment=portland.unemployment_rate, month_label=portland.date, ) ) national_source = ColumnDataSource( data=dict( month_data=national.datetime, unemployment=national.unemployment_rate, month_label=national.date, ) ) housing_source = ColumnDataSource( data=dict( month_data=house_price_index.datetime, price=house_price_index.price, month_label=s_and_p_index.date, ) ) s_and_p_source = ColumnDataSource( data=dict( month_data=s_and_p_index.datetime, value=s_and_p_index.VALUE, month_label=s_and_p_index.date, ) )
The Tools
These are the things that add interactivity to the plot. You have to create new ones for each figure so I made a function to get them.
def make_tools(): """makes the tools for the figures Returns: list: tool objects """ hover = HoverTool(tooltips=[ ("month", "@month_label"), ("unemployment", "@unemployment"), ]) tools = [ hover, CrosshairTool(), PanTool(), ResetTool(), ResizeTool(), SaveTool(), UndoTool(), WheelZoomTool(), ] return tools
The HoverTool tooltips argument is a list of tuples - one tuple for each dimension of the data. The first argument of the tuple (e.g. "month") is the label that will appear when the user hover's over the data point, while the second (e.g. "@month_label") tells bokeh which column to use for the data (so it has to match the key you used in the ColumnDataSource creation).
Helper Functions
The sub-figures needed some common elements so I created functions for them.
Scaling The Timestamps
The timestamps by default are unreadable (because there are so many). This re-scales them so they are more readable.
def scale_timestamp(index): """gets the scaled timestamp for element location Args: index: index in the portland.datetime series Returns: epoch timestamp used to locate place in plot """ return portland.datetime[index].timestamp() * TIME_SCALE
Drawing the Recession
The recession is indicated as a blue box on each plot.
def make_recession(): """Makes the box for the recession Returns: BoxAnnotation to color the recession """ return BoxAnnotation( left=scale_timestamp(recession_start), right=scale_timestamp(recession_end), fill_color="blue", fill_alpha=0.1)
Vertical Lines
Things like the unemployment lows and highs are indicated by a vertical line.
def make_vertical(location, color="darkorange"): """makes a vertical line Args: location: place on the x-axis for the line color (str): line-color for the line Returns: Span at index """ return Span( location=location, line_color=color, dimension="height", )
Make Verticals
Since there's more than one line, this function adds all the lines.
def make_verticals(fig): """makes the verticals and adds them to the figures""" fig.add_layout(make_vertical( location=scale_timestamp(unemployment_peaks[0]), color="darkorange", )) fig.add_layout(make_vertical( location=scale_timestamp(s_and_p_nadir[0]), color="crimson")) fig.add_layout(make_vertical( location=scale_timestamp(housing_nadir[0]), color="limegreen")) fig.add_layout(make_vertical( location=scale_timestamp(national_peak[0][0]), color="grey")) return
The Figures
This plot has three sub-figures, each of which is created separately then added to the Column.
Unemployment
tools = make_tools() unemployment_figure = figure( plot_width=FIGURE_WIDTH, plot_height=FIGURE_HEIGHT, x_axis_type="datetime", tools=tools, title="Portland Unemployment (2007-2017)" )
Next the lines for the time-series data are added.
unemployment_figure.line( "month_data", "unemployment", source=portland_source, line_color=PORTLAND_COLOR, legend=PORTLAND_LABEL, ) line = unemployment_figure.line( "month_data", "unemployment", source=national_source, line_color=NATIONAL_COLOR, legend=NATIONAL_LABEL, )
Now the recession-box and high and low points for each plot is added.
unemployment_figure.add_layout(make_recession()) make_verticals(unemployment_figure)
Now some labels are added and the grid is turned off.
unemployment_figure.yaxis.axis_label = "% Unemployment" unemployment_figure.xaxis.axis_label = "Month" unemployment_figure.xgrid.visible = False unemployment_figure.ygrid.visible = False
S & P 500
The S & P 500 had didn't have unemployment as the dependent variable so I made a different set of tools to change the label for the hover.
hover = HoverTool(tooltips=[ ("Month", "@month_label"), ("Value", "@value"), ]) tools = [ hover, CrosshairTool(), PanTool(), ResetTool(), ResizeTool(), SaveTool(), UndoTool(), WheelZoomTool(), ] s_and_p_figure = figure( plot_width=FIGURE_WIDTH, plot_height=FIGURE_HEIGHT, x_range=unemployment_figure.x_range, x_axis_type="datetime", tools=tools, title="S & P 500 Index", ) line = s_and_p_figure.line("month_data", "value", source=s_and_p_source, line_color=S_AND_P_COLOR) s_and_p_figure.add_layout(make_recession()) make_verticals(s_and_p_figure) s_and_p_figure.yaxis.axis_label = "S & P 500 Valuation" s_and_p_figure.xaxis.axis_label = "Month" s_and_p_figure.xgrid.visible = False s_and_p_figure.ygrid.visible = False s_and_p_figure.legend.location = "bottom_right"
Housing
hover = HoverTool(tooltips=[ ("Month", "@month_label"), ("Price", "@price"), ]) tools = [ hover, CrosshairTool(), PanTool(), ResetTool(), ResizeTool(), SaveTool(), UndoTool(), WheelZoomTool(), ] housing_figure = figure( plot_width=FIGURE_WIDTH, plot_height=FIGURE_HEIGHT, x_range=unemployment_figure.x_range, x_axis_type="datetime", tools=tools, title="House Price Index", ) line = housing_figure.line("month_data", "price", source=housing_source, line_color=HOUSING_COLOR) housing_figure.add_layout(make_recession()) make_verticals(housing_figure) housing_figure.yaxis.axis_label = "Sale Price ($1,000)" housing_figure.xaxis.axis_label = "Month" housing_figure.xgrid.visible = False housing_figure.ygrid.visible = False housing_figure.legend.location = "bottom_right"
Combining
Once the figures were created I combined them into a column, since I wanted them stacked verticallly.
combined = column(unemployment_figure, s_and_p_figure, housing_figure)
Outputting The Code
In order to be able to embed the code, you need to have bokeh export it. There are multiple ways to do this, but I chose the autoload_static method.
OUTPUT_JAVASCRIPT = "portland_unemployment.js" js, tag = autoload_static(combined, CDN, OUTPUT_JAVASCRIPT)
The third argument (OUTPUT_JAVASCRIPT) is the path you want to refer to in the tag. The returned js variable contains the javascript you need to save (using the filename you gave autoload_static) and the tag contains the HTML tag that you embed to let the server know you want to use the javascript that was saved.
Since both values are just strings, and nothing was saved to disk, I saved it for later.
with open(OUTPUT_JAVASCRIPT, "w") as writer: writer.write(js) with open("portland_tag.html", 'w') as writer: writer.write(tag)
Getting It Into Nikola
The first thing was to create this file using nikola new_post (it's called bokeh-test.rst). Next I created a directory in the files folder that had the same name as this file (without the ".rst" extension) to put the javascript in so nikola would find it when I built the HTML.
mkdir files/posts/bokeh-test
Once I copied the portland_unemployment.js file to the bokeh-test directory I opened the portland_tag.html file and embedded it directly into the post sing the raw restructureText directive.
.. raw:: html <script src="portland_unemployment.js" id="686c5dd6-168a-4f7d-acbc-524875d93b59" data-bokeh-model-id="c473232a-dc2c-4b75-988c-f9bc6517b4b9" data-bokeh-doc-id="402d8e3c-1595-4d65-9f76-e11068c629ab" ></script>