#+BEGIN_COMMENT
.. title: bokeh org-mode
.. slug: bokeh-org-mode
.. date: 2018-04-25 21:59:50 UTC-07:00
.. tags: bokeh,org-mode,how-to
.. category: Visualization
.. link:
.. description: An experiment in getting bokeh working with org-mode.
.. type: text
#+END_COMMENT
* Introduction
This is an illustration of how to use bokeh with org-mode in nikola. There is a more extensive and flexible explanation of how to do this in [[http://cherian.net/posts/bokeh-org-mode.html][this post]] on [[http://cherian.net][cherian.net]], but I made these notes to understand how it works and to have a simpler example to refer to.
I was interested in doing this because I'm trying to re-work some of what I did for the Coursera *Data Science With Python* specialization by changing the data-sets and building them as blog posts. I might convert the posts to restructured text at some point, but while I'm working with them I'm using org-mode. Also, while I use [[https://matplotlib.org][matplotlib]] for plotting most of the time, this is going to be a blog-first approach, so I decided to go with [[https://bokeh.pydata.org/en/latest/][bokeh]]. I had [[https://necromuralist.github.io/data_science/posts/bokeh-test/][previously written]] about how to get *bokeh* into [[https://getnikola.com][Nikola]] using restructured text, but as an intermediate step I want to do the work in org-mode and still be able to see the output as I'm working.
The magic mix for this seems to be to use:
- [[https://getnikola.com][Nikola]] to build the HTML posts
- [[http://orgmode.org][org-mode]], an emacs mode to format the posts
- [[https://plugins.getnikola.com/v7/orgmode/][the orgmode-plugin]] for nikola
- [[https://github.com/gregsexton/ob-ipython][ob-ipython]] to get [[https://jupyter.org][jupyter/ipython]] in the org-mode posts
- [[https://bokeh.pydata.org/en/latest/docs/user_guide/quickstart.html#userguide-quickstart][bokeh]] to make the plots
* Creating the Bokeh Plot
** Imports
These are the dependencies. It's really all =bokeh=; =numpy= is just there to generate the data values.
#+BEGIN_SRC python :session bokeh :results none
# from pypi
from bokeh.models import HoverTool
from bokeh.plotting import figure, ColumnDataSource
from bokeh.embed import autoload_static, file_html
import bokeh.resources
import numpy
#+END_SRC
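I probably should save bokeh to this repository to keep the post from breaking in the future, but I'm lazy so I'm just going to load it from a CDN. Note that the assignment rebinds the name =bokeh= from the module to the CDN resources object, which is what gets passed to =autoload_static= later on.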
#+BEGIN_SRC python :session bokeh :results none
bokeh = bokeh.resources.CDN
#+END_SRC
** The Data
To get a simple example going I'm just going to use ten random integers generated by numpy.
#+BEGIN_SRC python :session bokeh :results none
X = numpy.arange(10)
Y = numpy.random.randint(0, 10, 10)
#+END_SRC
In order to create a data-structure that bokeh can use (similar to a [[https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html][pandas dataframe]]) you need to use a [[https://bokeh.pydata.org/en/0.10.0/docs/reference/models/sources.html][ColumnDataSource]].
#+BEGIN_SRC python :session bokeh :results none
source = ColumnDataSource(data=dict(
    x=X,
    y=Y,
    desc=["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"],
))
#+END_SRC
The keys in the data-dict are essentially the column headers.
| Key | Description |
|------+----------------------------|
| x | the x-axis values |
| y | the y-axis values |
| desc | The labels for the tooltip |
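As a quick check that the keys really do behave like column headers, here is a minimal sketch that builds a smaller source and inspects it. The three rows of data and the names =example=, =columns= and =first_label= are made up for the illustration.
#+BEGIN_SRC python :session bokeh :results none
from bokeh.models import ColumnDataSource

# a three-row stand-in for the ten-row source above (the values are made up)
example = ColumnDataSource(data=dict(
    x=[0, 1, 2],
    y=[5, 3, 8],
    desc=["a", "b", "c"],
))

# each key in the dict becomes a named column
columns = sorted(example.column_names)

# a column is looked up by its key, much like a dataframe column
first_label = example.data["desc"][0]
#+END_SRC
Here =columns= ends up holding the three keys and =first_label= the first entry of the =desc= column.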
Now to get some tool-tips to pop up when you hover over the plot, I'll create a =HoverTool=.
#+BEGIN_SRC python :session bokeh :results none
hover = HoverTool(tooltips=[
    ('index', '$index'),
    ('(x, y)', '($x, $y)'),
    ('desc', '@desc'),
])
#+END_SRC
The =tooltips= list maps the labels that will show up in the tooltip (the first item in each tuple) to either columns in the =ColumnDataSource= (if preceded by an =@=) or generated values (if preceded by a =$=). The =$index= value is the index in the array where the data point sits (so for the first point it will be 0, for the second it will be 1, etc.). The =($x, $y)= values are the coordinates of your pointer when you hover over the data points, and =@desc= will be replaced by the label I set in the =ColumnDataSource= for that particular data-point.
** The Plot
Now I'll create the actual plot (=figure=).
#+BEGIN_SRC python :session bokeh :results none
fig = figure(title="Random Example", x_axis_label="x", y_axis_label="y",
             tools=[hover])
fig.line('x', 'y', source=source)
fig.circle('x', 'y', size=10, source=source)
#+END_SRC
* Getting the Bokeh Plot Into The Post
Finally I'll save the javascript file that's needed and then output the blob of HTML needed to embed the plot into this post. The =autoload_static= function takes the bokeh plot object (=fig=), the bokeh resources that I loaded earlier (=bokeh=), and the name of the javascript file that the HTML should load (=test.js=), and returns the javascript to save (=javascript=) and the HTML fragment that will include the javascript (=source=). It doesn't write anything to disk itself, so saving the file is up to you. Note that because of the way nikola structures things I have to create a folder named =files/posts/bokeh-org-mode= and save the file there. Since nikola will automatically look in this folder, the name you pass into =autoload_static= should just be the filename without the path, but when you save the javascript file you need to add the relative path to that folder. If my explanation seems a little convoluted, just look at the code below; it's fairly simple.
First I'll create a variable to hold the path to the folder to save the files in. All files for nikola posts go into sub-folders of =files/posts/= and since the source file for this post is called =bokeh-org-mode.org=, the files to include in it go into the folder =files/posts/bokeh-org-mode= (=files/posts/= plus the slug for the post).
#+BEGIN_SRC python :session bokeh :results none
FOLDER_PATH = "../files/posts/bokeh-org-mode/"
#+END_SRC
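One thing that can trip this up is that =open= won't create missing folders, so saving the javascript fails if =files/posts/bokeh-org-mode= doesn't exist yet. Here's a minimal sketch of creating the folder from python instead of by hand. The =ensure_post_folder= function is a hypothetical helper of mine (it isn't part of nikola or bokeh), and its default =root= assumes the same relative layout as =FOLDER_PATH=.
#+BEGIN_SRC python :session bokeh :results none
from pathlib import Path


def ensure_post_folder(slug, root="../files/posts"):
    """Create root/slug if it is missing and return it as a Path.

    Hypothetical helper: the default root assumes the relative layout
    used for FOLDER_PATH in this post.
    """
    folder = Path(root) / slug
    # parents=True builds any missing parent folders,
    # exist_ok=True makes it safe to run the block more than once
    folder.mkdir(parents=True, exist_ok=True)
    return folder
#+END_SRC
Calling =ensure_post_folder("bokeh-org-mode")= once before saving the javascript should be enough.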
** The Javascript
Now, I'll create the javascript source for the plot.
#+BEGIN_SRC python :session bokeh :results none
FILE_NAME = "test.js"
javascript, source = autoload_static(fig, bokeh, FILE_NAME)
with open(FOLDER_PATH + FILE_NAME, "w") as writer:
    writer.write(javascript)
#+END_SRC
The =javascript= variable holds the actual javascript source code (which then gets saved) while the =source= variable holds the string with the HTML to embed the javascript into this post (which I show at the end of this post).
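To make the two return values a little more concrete, here's a self-contained sketch that builds a throwaway figure and saves the javascript to a temporary folder instead of the post's folder. The names =demo_figure=, =demo_js= and =demo_tag= and the file name =demo.js= are placeholders I made up for the illustration.
#+BEGIN_SRC python :session bokeh :results none
import tempfile
from pathlib import Path

from bokeh.embed import autoload_static
from bokeh.plotting import figure
from bokeh.resources import CDN

# a throwaway figure (the values are made up)
demo_figure = figure(title="Demo")
demo_figure.line([0, 1, 2], [5, 3, 8])

# the third argument is the path that the HTML tag will point at,
# not the place where the file gets saved
demo_js, demo_tag = autoload_static(demo_figure, CDN, "demo.js")

# saving the javascript is a separate step, so the location is up to you
with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "demo.js"
    target.write_text(demo_js, encoding="utf-8")
    saved_size = target.stat().st_size
#+END_SRC
The =demo_tag= string is a short script tag that refers to =demo.js=, while =demo_js= is the much longer javascript that draws the plot.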
** Embedding the Plot
Finally, we need to print out the string that is stored in the =source= variable which then tells org-mode to embed the files into this post. I'll output the full org-block so you can see the header arguments.
#+BEGIN_SRC org
#+BEGIN_SRC python :session bokeh :results output raw :exports results
print('''#+BEGIN_EXPORT html
{}
#+END_EXPORT'''.format(source))
,#+END_SRC
#+END_SRC
#+BEGIN_SRC python :session bokeh :results output raw :exports results
print('''#+BEGIN_EXPORT html
{}
#+END_EXPORT'''.format(source))
#+END_SRC
#+RESULTS:
#+BEGIN_EXPORT html
#+END_EXPORT
And there you have it. I don't have a lot to say about it, other than that if you hover over the data points with your cursor and then look back up at the =ColumnDataSource=, you'll see that the values in the tooltip match the inputs.
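Since the only org-specific part is the wrapper around the tag, the print statement could be pulled into a small function if there's more than one plot in a post. This is only a sketch: =org_html_export= is a name I made up, and the tag in the last line is a placeholder rather than real bokeh output.
#+BEGIN_SRC python :session bokeh :results none
def org_html_export(fragment):
    """Wrap an HTML fragment in an org-mode html export block."""
    lines = ["#+BEGIN_EXPORT html", fragment.strip(), "#+END_EXPORT"]
    return "\n".join(lines)


# placeholder tag, real code would pass the string returned by autoload_static
wrapped = org_html_export('<script src="test.js"></script>')
#+END_SRC
In the embedding block that would become =print(org_html_export(source))=, with the same =:results output raw :exports results= header arguments.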
* Summary
To get a bokeh figure into an org-mode document in nikola:
1. Create the bokeh plot.
2. Create a folder in the =files/posts/= folder that matches the slug for the post.
3. Use =autoload_static= to convert the bokeh object to javascript and create the HTML tag to embed it.
4. Save the javascript in the =files/posts//= folder that you created.
5. Print the HTML fragment in an org-mode =#+BEGIN_EXPORT html= block.