pip-tools and pipdeptree


I was looking for a way to update python dependencies that I'd installed with pip when I stumbled upon pip-tools. I'm not particularly good about keeping everything in sync and up-to-date so I'm hoping that this will make it easier to do and thus more likely that I'll do it. It's been a little while since I first used it and I had to look it up, so these are my notes to my future self.

First pipdeptree

pip-tools installs a command named pip-compile which will use either the requirements you put in your setup.py file or a special file named requirements.in (if you call it this you don't have to pass in the filename, otherwise you have to tell it where to look). Unless there are only a few requirements I prefer to use a separate file, rather than setup.py, since it makes things clearer and makes it more likely that I'll keep it up to date. The requirements.in file is a list of your dependencies but, unlike the requirements.txt file, it doesn't have version numbers; the version numbers are added when you call the pip-compile command.

So where does the requirements.in file come from? You have to make it. But if you're editing things by hand, doesn't this kind of make it less likely you'll maintain it? Yes, which is where pipdeptree comes in. pipdeptree will list all the python dependencies you installed as well as everything those dependencies pulled in as their dependencies. It's useful for taking a look at how a dependency you didn't directly install got into your virtual environment. You can install it from PyPI.

pip install pipdeptree

Here's its help output.

pipdeptree -h
usage: pipdeptree [-h] [-v] [-f] [-a] [-l] [-u] [-w [{silence,suppress,fail}]]
                  [-r] [-p PACKAGES] [-j] [--json-tree]
                  [--graph-output OUTPUT_FORMAT]

Dependency tree of the installed python packages

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -f, --freeze          Print names so as to write freeze files
  -a, --all             list all deps at top level
  -l, --local-only      If in a virtualenv that has global access do not show
                        globally installed packages
  -u, --user-only       Only show installations in the user site dir
  -w [{silence,suppress,fail}], --warn [{silence,suppress,fail}]
                        Warning control. "suppress" will show warnings but
                        return 0 whether or not they are present. "silence"
                        will not show warnings at all and always return 0.
                        "fail" will show warnings and return 1 if any are
                        present. The default is "suppress".
  -r, --reverse         Shows the dependency tree in the reverse fashion ie.
                        the sub-dependencies are listed with the list of
                        packages that need them under them.
  -p PACKAGES, --packages PACKAGES
                        Comma separated list of select packages to show in the
                        output. If set, --all will be ignored.
  -j, --json            Display dependency tree as json. This will yield "raw"
                        output that may be used by external tools. This option
                        overrides all other options.
  --json-tree           Display dependency tree as json which is nested the
                        same way as the plain text output printed by default.
                        This option overrides all other options (except
  --graph-output OUTPUT_FORMAT
                        Print a dependency graph in the specified output
                        format. Available are all formats supported by
                        GraphViz, e.g.: dot, jpeg, pdf, png, svg

If you look at the options you can see that there's a --freeze option, that's what we'll be using. Let's look at some of what that looks like.

pipdeptree --freeze | head

So it looks like the output of pip freeze except it puts the packages you installed flush-left and then uses indentation to indicate what that package installed. In the example above, I installed Nikola, then Nikola installed doit, and doit installed cloudpickle and pyinotify. I kind of remember needing to install pyinotify myself, but maybe pipdeptree caught that it was a dependency that doit is using. Anyway.

For our requirements.in file we only want the names, and although there might be a reason to keep the entire tree, I think it makes it easier to understand what I'm using if the file only holds the dependencies at the top-level (the ones that I'm using directly, rather than being a dependency of a dependency). So, we'll use a little grep. First, since I'm a python-programmer I'm going to give it the -P flag to use perl escape codes. Next, we want to only match the lines that have alpha-numeric characters as the first character in the line.

grep Description
-P, --perl-regexp Use perl regular expression syntax
^ Match the beginning of a line
\w Match alpha-numeric character and underscores
+ Match one or more

First, let's see how many total dependencies there are.

pipdeptree --freeze | wc -l
160

So there are 160 dependencies total. How many did I install?

pipdeptree --freeze | grep --perl-regexp "^\w+" | wc -l

Out of the 160 only 11 were directly installed by me.

So we're done, right? Not yet, we need to get rid of the == and version numbers. I hadn't known that grep had this feature, since I normally use python instead of grep, but grep has an --only-matching option that will discard the parts of the line that don't match.

grep Description
-o, --only-matching Only show the parts of the line that match
pipdeptree --freeze | grep --only-matching --perl-regexp "^\w+"

If you look at the first entry you'll notice it says ghp, but the actual name of the package is ghp-import, but the hyphen isn't part of the alpha-numeric set, so we'll have to add it.

grep Description
[] Match one of the entries in the brackets
pipdeptree --freeze | grep -oP "^[\w\-]+"

This looks like what we want, but there's a couple of things that we should take care of that would happen if this were for an installed package.

  • there's a bug in ubuntu that causes pip to include pkg-resources, which isn't something you can install.
  • it will add an extra entry for your python egg, something like this:
-e git+git@github.com:russell-n/iperflexer.git@65f4d3ca72670591f584efa6fa9bfd64c18a925b#egg=iperflexer

So we should filter those out.

grep Description
-v, --invert-match Return lines that don't match
pipdeptree --freeze | grep --only-matching --perl-regexp "^[\w\-]+" | grep --invert-match "\-e\|pkg"

There are probably other exceptions that have to be added for other installations, but this looks like enough for us. Now we can redirect this to a requirements.in file and we're ready for pip-tools.

pipdeptree --freeze | grep --only-matching --perl-regexp "^[\w\-]+" | grep --invert-match "\-e\|pkg" > requirements.in
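Since the filtering above is just a regular expression plus two exclusions, it can also be sketched in Python. This is only an illustration under some assumptions: the pipdeptree --freeze output has already been captured as a string, the function name top_level_names is made up, and the sample text (version numbers and repository URL included) is invented. Unlike the grep version it drops only the exact entries -e and pkg-resources, so a package whose name merely contains "-e" or "pkg" is kept. Like the grep pattern, it would cut a name short at a dot.

```python
import re

# Same pattern as the grep call: word characters or hyphens at the start of a line.
TOP_LEVEL = re.compile(r"^[\w\-]+")

# Entries that are not installable requirements (see the two exceptions above).
EXCLUDED = {"-e", "pkg-resources"}


def top_level_names(freeze_output):
    """Return the names of the flush-left (directly installed) packages."""
    names = []
    for line in freeze_output.splitlines():
        match = TOP_LEVEL.match(line)  # indented lines do not match
        if match and match.group() not in EXCLUDED:
            names.append(match.group())
    return names


# Invented sample in the shape of `pipdeptree --freeze` output.
sample = """\
Nikola==7.8.15
  doit==0.31.1
    cloudpickle==0.5.3
ghp-import==0.5.5
pkg-resources==0.0.0
-e git+git@github.com:example/example.git@abc123#egg=example
"""

print("\n".join(top_level_names(sample)))
```

Writing the printed names to requirements.in would give the same kind of file as the shell pipeline.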


pip-compile will read in the requirements.in file and add the version numbers and can create a requirements.txt file. It will automatically look for the requirements.in file or you can explicitly pass in the filename.

pip-compile | head
# This file is autogenerated by pip-compile
# To update, run:
#    pip-compile --output-file requirements.txt requirements.in
argh==0.26.2              # via watchdog
backcall==0.1.0           # via ipython
bleach==2.1.3             # via nbconvert
blinker==1.4              # via nikola

You'll notice it adds in the dependencies of the dependencies and shows what requires them.
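If you ever want those "# via" annotations in a form you can query, a small parser is enough. This is a rough sketch that assumes the one-line "name==version  # via parent" layout shown in the output above (other pip-tools versions may lay the comments out differently); the function name parse_via is made up for illustration.

```python
def parse_via(compiled_text):
    """Map each pinned package to the list of packages that require it."""
    required_by = {}
    for line in compiled_text.splitlines():
        # Skip the header comments and anything that is not a pin.
        if line.startswith("#") or "==" not in line:
            continue
        pin, _, comment = line.partition("#")
        name = pin.split("==", 1)[0].strip()
        comment = comment.strip()
        parents = []
        if comment.startswith("via"):
            parents = [parent.strip()
                       for parent in comment[len("via"):].split(",")
                       if parent.strip()]
        required_by[name] = parents
    return required_by


# The lines shown in the pip-compile output above.
sample = """\
# This file is autogenerated by pip-compile
argh==0.26.2              # via watchdog
backcall==0.1.0           # via ipython
bleach==2.1.3             # via nbconvert
blinker==1.4              # via nikola
"""

for package, parents in parse_via(sample).items():
    print(package, "<-", ", ".join(parents))
```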

Well, that was a lot of work just for that.

If we stopped at this point we'd have:

  • a way to check who installed what using pipdeptree (as well as a way to plot the dependencies as a graph)
  • a way to separate out our dependencies into a separate file (requirements.in) to make it easier to read
  • a way to create our requirements.txt file using our requirements.in file

I think that's kind of nice already, especially if you end up with a lot of dependencies. Try working with sphinx and scikit-learn and you'll see things start to explode. But of course, there's always more.


You can run pip-compile with the --upgrade option to try and update dependencies whenever you want to make sure you have the latest of everything (you can do it per-package too, but nah).

pip-compile --upgrade | head
# This file is autogenerated by pip-compile
# To update, run:
#    pip-compile --output-file requirements.txt requirements.in
argh==0.26.2              # via watchdog
backcall==0.1.0           # via ipython
bleach==2.1.3             # via nbconvert
blinker==1.4              # via nikola

This will upgrade your installation but not update the requirements.txt file, so you can test it out and see if everything works before updating the requirements.txt. If things don't work out, you could reinstall from the requirements.txt file, but see the next section for another way.


pip-tools also installed a command called pip-sync which will keep you in sync with what is in the requirements file, so as long as requirements.txt is always a working version, you can sync up with it to avoid problems with changes in any of the dependencies. This is different from the --upgrade option in that it will only install the exact version in the requirements file.

Collecting backcall==0.1.0
Collecting bleach==2.1.3
  Using cached https://files.pythonhosted.org/packages/30/b6/a8cffbb9ab4b62b557c22703163735210e9cd857d533740c64e1467d228e/bleach-2.1.3-py2.py3-none-any.whl
Collecting certifi==2018.4.16
  Using cached https://files.pythonhosted.org/packages/7c/e6/92ad559b7192d846975fc916b65f667c7b8c3a32bea7372340bfe9a15fa5/certifi-2018.4.16-py2.py3-none-any.whl
Collecting cloudpickle==0.5.3
  Using cached https://files.pythonhosted.org/packages/e7/bf/60ae7ec1e8c6742d2abbb6819c39a48ee796793bcdb7e1d5e41a3e379ddd/cloudpickle-0.5.3-py2.py3-none-any.whl
Successfully installed backcall-0.1.0 bleach-2.1.3 certifi-2018.4.16 cloudpickle-0.5.3 decorator-4.3.0 doit-0.31.1 ipykernel-4.8.2 ipython-6.4.0 jedi-0.12.0 jupyter-client-5.2.3 logbook-1.4.0 lxml-4.2.1 natsort-5.3.2 nikola-7.8.15 notebook-5.5.0 parso-0.2.1 pexpect-4.6.0 pillow-5.1.0 python-dateutil-2.7.3 send2trash-1.5.0 tornado-5.0.2 virtualenv-16.0.0 virtualfish-1.0.6 wheel-0.31.1 ws4py-0.5.1

Since I upgraded the installation the requirements.txt file is now behind the latest versions, so by syncing I undid the upgrade. This time I'll upgrade again and save the output.

pip-compile --upgrade

So now the file and my installation should be in sync.

Everything up-to-date
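To make the idea of "in sync" concrete, here is a minimal sketch of the comparison involved. It is not how pip-sync is implemented; it just compares the pins in a requirements.txt-style string against a dictionary of installed versions. The function names and the installed versions are made up for illustration.

```python
def parse_pins(requirements_text):
    """Read name==version pins, ignoring comments and blank lines."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def sync_plan(pins, installed):
    """Return what would need to change for `installed` to match `pins`."""
    to_install = {name: version for name, version in pins.items()
                  if installed.get(name) != version}
    to_remove = sorted(name for name in installed if name not in pins)
    return to_install, to_remove


requirements = """\
bleach==2.1.3             # via nbconvert
blinker==1.4              # via nikola
"""

# Hypothetical environment: one outdated package and one stray package.
installed = {"bleach": "2.1.2", "blinker": "1.4", "leftover-package": "0.1"}

to_install, to_remove = sync_plan(parse_pins(requirements), installed)
print("install:", to_install)
print("remove:", to_remove)
```

When both results are empty the environment already matches the file, which is the "Everything up-to-date" case.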


So there you have it, how to keep dependencies synced. The README for pip-tools is much briefer, but I thought I'd add a little more detail to the part of it that I plan to use the most.

Setting Up the RTL 8812AU Realtek USB Adapter on a Raspberry Pi 3


I'm trying to set-up a wireless packet monitor (it's something I've long thought might be an interesting source of data, and now I need it for work too). My thought was that I'd set up a raspberry pi to experiment with - I don't think it is powerful enough to work, but it should work just to mess with code, and a distributed system might get some interesting results, but anyway - but when I tried to put my raspberry pi's wireless interface into monitor mode I got an error.

iwconfig wlan0 mode monitor
Error for wireless request "Set Mode" (8B06) :
    SET failed on device wlan0 ; Operation not supported.

Looking around on the web I found this reddit post as well as some Stack Overflow posts that said that monitor mode isn't supported on the Raspberry Pi. There is a project called nexmon that apparently lets you add a firmware patch to enable it, which I'll probably try later, but before I tried that I remembered that I have a Realtek 8812AU USB WiFi adapter that I bought a while ago for an old desktop I had that I wasn't using, so I decided to try it.

What I tried


The first thing I did was to see if it would just work. I plugged the Realtek into the USB port and although lsusb showed it, iwconfig didn't show it as an interface. Back to the internet.


Next I found a repository on github that has the driver for the Realtek set up for linux machines. I downloaded it and followed the instructions to build it - the main thing is to set:


in the Makefile - but when I tried to build it I got this error.

sudo dkms install -m $DRV_NAME -v $DRV_VERSION

'make' KVER=4.4.38-v7+....(bad exit status: 2)
ERROR (dkms apport): binary package for rtl8812AU: 4.3.20 not found
Error! Bad return status for module build on kernel: 4.4.38-v7+ (armv7l)
Consult /var/lib/dkms/rtl8812AU/4.3.20/build/make.log for more information.

There was also a message in the make.log file but I didn't remember to copy it.

What fixed it

The solution was in this StackOverflow post - the make program is being pointed to a folder named armv7l (that's 'arm vee seven ell') but it should actually be pointed to one named arm. The simple solution is to create a symbolic link with the name make is looking for that points to the real folder.

sudo ln -s /usr/src/linux-headers-4.4.38-v7+/arch/arm/ /usr/src/linux-headers-4.4.38-v7+/arch/armv7l

This turns out to fix the build problem and after a reboot the network interface showed up.
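For the record, the same kind of link can be made from Python. This sketch runs against a throwaway temporary directory rather than the real /usr/src headers folder, and the function name link_arch_folder is made up; it only demonstrates that a link named armv7l pointing at arm resolves to the same folder.

```python
import os
import tempfile


def link_arch_folder(headers_root, existing="arm", expected="armv7l"):
    """Create arch/<expected> as a symbolic link to arch/<existing>."""
    arch = os.path.join(headers_root, "arch")
    link = os.path.join(arch, expected)
    if not os.path.lexists(link):  # do not clobber an existing entry
        os.symlink(os.path.join(arch, existing), link)
    return link


# Stand-in for the kernel headers folder, so nothing real is touched.
with tempfile.TemporaryDirectory() as headers_root:
    os.makedirs(os.path.join(headers_root, "arch", "arm"))
    link = link_arch_folder(headers_root)
    is_link = os.path.islink(link)
    resolves_to_folder = os.path.isdir(link)

print(is_link, resolves_to_folder)
```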


The Raspberry Pi 3 doesn't support monitor mode for its wireless interface out of the box, and while there is a firmware patch to enable it, I chose to use a Realtek RTL 8812AU USB WiFi adapter instead. You need a little bit of extra work to get it going, but it does seem to work. One thing I noticed is that iwconfig will put it in monitor mode but airmon-ng doesn't (I haven't figured out why yet). It doesn't report an error, it just doesn't seem to work. Also, iw always reports the interface as managed, even when it isn't… maybe I'll try the firmware patch after all.

Packet Capturing Bibliography

Packet Capturing

  • [R3] Schulman A, Levin D, Spring N. On the fidelity of 802.11 packet traces. In International Conference on Passive and Active Network Measurement 2008 Apr 29 (pp. 132-141). Springer, Berlin, Heidelberg. (PDF Link)
  • [R4] Kashyap A, Paul U, Das SR. Deconstructing interference relations in WiFi networks. In Sensor Mesh and Ad Hoc Communications and Networks (SECON), 2010 7th Annual IEEE Communications Society Conference on 2010 Jun 21 (pp. 1-9). IEEE. (PDF Link)
  • [R5] Dujovne D, Turletti T, Filali F. A taxonomy of IEEE 802.11 wireless parameters and open source measurement tools. IEEE Communications Surveys & Tutorials. 2010 Apr 1;12(2):249-62. (PDF Link)
  • [R6] Serrano P, Zink M, Kurose J. Assessing the fidelity of COTS 802.11 sniffers. In INFOCOM 2009, IEEE 2009 Apr 19 (pp. 1089-1097). IEEE. (PDF Link)

Merging PCAP Files


  • [R<number>] - Research Paper

Networking Monitoring Tools

Packet Capturing


Gulp purports to be better at capturing packets than tcpdump (although they can work together).

There is more than one version out there:

  • This one says it applied a patch to it five years ago.
  • This one says it is the original but hasn't been updated in six years.
  • This blog post has updated versions of it including one in 2017 that says it has a major bug fix (but I don't know if it's a gulp bug or not)


More easily obtainable and better documentation available (although still not enough).


Captures packets and decodes SSL/TLS packets.

Packet Examining

Compressed PCAP Packet Indexing Program (cppip)

This adds indexing to bgzip compressed LibPCAP files which then lets you extract them while the original files are still compressed.


This lets you extract part of or combine files created by tcpdump when using file rotation.


Describes itself as like GNU grep but for packets.

pylibpcap, pypcap

Python code to work with libpcap.


These are installed when you install wireshark.


Packet capturing and examining (better documented than most of the other programs)


Reorders the packets by timestamp.


This prints summary information about packet files (works with gzipped files).


Merges multiple packet files together. Mergecap will try to keep timestamps in order when merging, but it assumes each individual file to merge is already in order.
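The caveat about each file already being in order is easier to see with a toy example. The sketch below is not mergecap's code; it uses Python's heapq.merge on made-up lists of (timestamp, label) tuples standing in for packets, which behaves the same way in that it interleaves inputs by timestamp but never re-sorts within an input.

```python
import heapq


def merge_captures(*captures):
    """Interleave already-sorted captures by timestamp (first tuple item)."""
    return list(heapq.merge(*captures, key=lambda packet: packet[0]))


# Two captures that are each in timestamp order.
capture_a = [(1.0, "a1"), (3.0, "a2"), (5.0, "a3")]
capture_b = [(2.0, "b1"), (4.0, "b2")]
merged = merge_captures(capture_a, capture_b)

# A capture that is NOT in order: the merge does not repair it.
shuffled = [(3.0, "x1"), (1.0, "x2")]
not_sorted = merge_captures(shuffled, [(2.0, "y1")])

print([timestamp for timestamp, _ in merged])
print([timestamp for timestamp, _ in not_sorted])
```

The second result comes out of order, which is why a capture with out-of-order packets would need to be reordered by timestamp before merging.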

Packet Flows


Track, reassemble, reorder TCP streams.


Gives connection information taken from a capture file.


Separates out TCP flows into separate files.

Capture Summarizing


Summarizes packet information in ASCII format


Gives summary statistics for a pcap file

Network Monitoring


Like top but for the network.



Read from and write to TCP/UDP network connections.


Route data between byte streams.

The Flask Debug Server


This is documented on the Flask site, but I was trying to help someone debug some old server code that I'd written and couldn't remember how to debug it, so I'm documenting it here as I go through remembering it again so I'll have a single reference to use the next time. Some of the settings look different from what I remember using so I think that Flask has changed a little over time, but since I didn't document it the first time I don't have a record to compare against (well, I probably have some notes in a notebook but it's hard to refer to that when sitting at someone else's desk).


The Flask Quickstart tells you what to do, but for some reason when we googled it the instructions were different; I think it might have led us to an older form of the documentation. This is the current version (May 20, 2018).

The Environment Variables

The Flask App

First you have to tell flask which file contains your flask app by setting the FLASK_APP environment variable. In my case I'm using connexion, an oddly-named adapter that adds support for swagger/OpenApi to Flask. So the file that has the app has this line in it.

application = connexion.FlaskApp(__name__)

In this case that's a file named api.py which we'll say is in the server folder (it isn't, but that's okay) so we need to set our environment accordingly. I use the fish shell so the syntax is slightly different from the Quick Start example. Also - and this caused me a lot of trouble - when I didn't pass in the name of my FlaskApp instance I got this error:

Error: Failed to find application in module "server.api".  Are you sure it contains a Flask application?  Maybe you wrapped it in a WSGI middleware or you are using a factory function.

So I had to specifically tell flask the name of my app by appending it to the end of the setting (perhaps it is looking for app specifically, but I called mine application).

set -x FLASK_APP server.api:application
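The module:name form is a common convention for pointing a tool at an object inside a module. Here is a minimal sketch of how such a string can be resolved; it is not Flask's actual loader (which does more searching when no name is given), the function name resolve is made up, and it is demonstrated on a standard-library target so it runs without the server code.

```python
import importlib
import os.path


def resolve(target):
    """Import the module before the colon and return the named attribute."""
    module_name, _, attribute = target.partition(":")
    if not attribute:
        raise ValueError("expected 'module:name', got {!r}".format(target))
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# Standard-library stand-in for something like "server.api:application".
join = resolve("os.path:join")
print(join("server", "api.py"))
```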

Development Mode

If you want the server to automatically reload when it detects changes then you should set the FLASK_ENV environment variable to development. This is similar to using FLASK_DEBUG but I think it adds the reloading. Anyway, it does more than just set debug mode.

set -x FLASK_ENV development

Run It

The Development server

This is the output of the help string for the development server, note that it uses -h for host so you need to pass in --help to see this output or you will get an error.

flask run --help

Public Server

The default server runs on localhost, but since I'm hosting the code on a raspberry pi sitting on the network somewhere but debugging it remotely, I need to run it on a public address.

flask run --host=

Make it repeatable

Monitor Mode With airmon-ng


I'm looking into setting up a wireless (WiFi and bluetooth) monitoring station to collect data that correlates with how my network is performing and what the state of the network is and I thought that, as a first step, I'd get some packet capturing logs going. I'm primarily a python programmer who's kept my toe in the Linux command-line world but it's been a little while since I really dove into the wireless networking world. I had some vague notion about doing it with iwconfig or iw, but then I found airmon-ng and realized that it was what I was really looking for. Why is it better? Well, to be honest, I'm not informed enough to say that it's better, but when I tried to use iw it failed without really telling me why, while airmon-ng not only didn't fail, but it told me that there were other processes already using my wireless interface which is likely why iw failed and it told me how to fix it. On the one hand, since it's hiding so much from you airmon-ng lets you be a little ignorant and still do stuff, on the other - what's wrong with that?

Setting Up

I'm using Ubuntu 18.04 (Bionic Beaver) - which seems to have both fixed and broken a surprising amount of stuff (nice that you let me log in with Dvorak now, but maybe you should let me know the keyboard layout has changed ahead of time) - so these instructions are based on that. First, airmon-ng is part of the aircrack-ng package so you need to install it to get what we want.

sudo apt install aircrack-ng

Once you do this you'll see that airmon-ng is installed.

which airmon-ng

Interestingly, if you check it out, you'll see that all it is is a bash script.

file `which airmon-ng`

The file is kind of long.

wc -l `which airmon-ng`

So I won't list it here - you can check it out if you're interested. It's actually very informative if you want to learn how to do this kind of stuff, but for this case, we just need to know it works.

Monitor Mode

Starting Up Monitor Mode

Finding your interface

In the good old days you could be pretty sure that your wireless interface was wlan0 (assuming you only had one) but then ubuntu/freedesktop went and changed things so now you should probably check what your interface name is using iw.

iw dev

So it looks like we have a wireless interface named wlp2s0 that we want to change from managed to monitor mode.

Okay, now monitor it

The syntax to start monitor mode is airmon-ng start <interface>.

sudo airmon-ng start wlp2s0
Found 5 processes that could cause trouble.
If airodump-ng, aireplay-ng or airtun-ng stops working after
a short period of time, you may want to run 'airmon-ng check kill'

  PID Name
 1505 wpa_supplicant
 1524 NetworkManager
 1541 avahi-daemon
 1748 avahi-daemon
 2298 dhclient

PHY     Interface       Driver          Chipset

phy0    wlp2s0          iwlwifi         Intel Corporation Wireless 7260 (rev 73)

                (mac80211 monitor mode vif enabled for [phy0]wlp2s0 on [phy0]wlp2s0mon)
                (mac80211 station mode vif disabled for [phy0]wlp2s0)

The first thing you should notice is that there are five potentially interfering processes. This is probably what interferes with the iw method, but we'll leave it alone and see if it works. Why don't we check on the interface?

iw dev

	Interface wlp2s0mon
		ifindex 5
		wdev 0x3
		addr 7c:5c:f8:f7:f5:c6
		type monitor
		channel 10 (2457 MHz), width: 20 MHz (no HT), center1: 2457 MHz
		txpower 0.00 dBm

So you can see that running airmon-ng start killed our original wlp2s0 interface and replaced it with wlp2s0mon, which is in monitor mode on channel 10. Unfortunately I wanted channel 6 but forgot to specify it. Let's try that again.

The first thing we have to do is to turn off monitor mode.

sudo airmon-ng stop wlp2s0mon

Note that we are stopping the new monitor-mode interface, not our original wireless interface. Now we can start the monitor-mode interface set to channel 6. The syntax is airmon-ng start <interface> <channel>.

sudo airmon-ng start wlp2s0 6

There's some output from the command, but we want to know what iw thinks is going on.

iw dev

	Interface wlp2s0mon
		ifindex 7
		wdev 0x6
		addr 7c:5c:f8:f7:f5:c6
		type monitor
		channel 6 (2437 MHz), width: 20 MHz (no HT), center1: 2437 MHz
		txpower 0.00 dBm
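Since I'll eventually want to check this from a script rather than by eye, here is a rough sketch of pulling the interface name, type, and channel out of iw dev output. It assumes the text has the same layout as the output above and has already been captured as a string; the function names are made up for illustration.

```python
def parse_iw_dev(output):
    """Return {interface: {"type": ..., "channel": ...}} from `iw dev` text."""
    interfaces = {}
    current = None
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[0] == "Interface":
            current = fields[1]
            interfaces[current] = {}
        elif current is not None and fields[0] in ("type", "channel"):
            interfaces[current][fields[0]] = fields[1]
    return interfaces


def monitor_interfaces(interfaces):
    """Names of the interfaces that report monitor mode."""
    return [name for name, info in interfaces.items()
            if info.get("type") == "monitor"]


# The output shown above.
sample = """\
Interface wlp2s0mon
    ifindex 7
    wdev 0x6
    addr 7c:5c:f8:f7:f5:c6
    type monitor
    channel 6 (2437 MHz), width: 20 MHz (no HT), center1: 2437 MHz
    txpower 0.00 dBm
"""

parsed = parse_iw_dev(sample)
print(parsed)
print(monitor_interfaces(parsed))
```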

So now we have an interface (wlp2s0mon) on channel 6 in monitor mode. We can make sure that it's working using tcpdump (https://tcpdump.org).

sudo tcpdump -i wlp2s0mon -n

Note that we need to use the new interface name. Also, if it wasn't obvious up to now, putting the interface into monitor mode will break any networking capabilities for that interface on your computer (so if it was your internet connection, don't expect to access the web when it's in monitor mode).

Cleaning Up

We already got a preview of turning off monitor mode earlier. The syntax is airmon-ng stop <interface>.

sudo airmon-ng stop wlp2s0mon

This will bring back the original wireless interface, but it won't (likely) re-establish your connection to your wireless access point. To get back onto the network you will probably need to open network manager and go through the setup process again.


These were my notes on setting up monitor mode using airmon-ng. The main point I wanted to get across is how easy it is to do using airmon-ng as opposed to the other methods. I didn't actually show how much harder it is to use iwconfig, but if you have tried you might know what it entails. In any case, hopefully these notes will help me in the future as I keep watching the packets.

TCP Dump Notes

These are notes I made while surfing the web looking into TCP Dump. You will most likely need to use sudo to run most of the commands, but I'm leaving it off to make it shorter.

About TCP Dump

  • It has more filtering capabilities and can filter while capturing packets, but it doesn't have the analytical tools that something like wireshark has [1].

Some Examples

Listing interfaces

You can ask tcpdump which interfaces it is able to listen to [2].

tcpdump -D

Capture packets on an interface

To capture packets on an interface you pass its name to the -i flag [2] (here the interface I'll use is eno1).

tcpdump -i eno1

Save the packet capture to a file

The default behavior is for tcpdump to send the output to standard output. To have it save the packets to a file, use the -w flag [2] (you can call it anything, I'll call it dump.pcap).

tcpdump -i eno1 -w dump.pcap

Increase the verbosity of the capture

To increase the amount of information that's captured, pass multiple v arguments [2] (in this case I'll use -vvv).

tcpdump -i eno1 -vvv -w dump.pcap


By IP address

You can get all the packets being sent or received by a host using the host argument [3].

tcpdump host

By Sender IP Address

You can filter out all the packets except those that are being sent by a host using the src host argument [2].

tcpdump -i eno1 src host

You can leave off the host argument and just use src [3].

By Target IP Address

To filter out all the packets except those that are going to a specific target use the dst host argument [2].

tcpdump -i eno1 dst host

Sender and Target IP Addresses

You can combine parameters using the logical operators and, or, and not [3].

tcpdump 'src and dst'

The single quotes are optional and are just used to group the arguments together.

By Subnet

You can grab all the packets on a network or subnet using the net argument and CIDR notation [3]. This example grabs all the packets on the 192.168.1.* subnet.

tcpdump net

By port and/or protocol

If you want to only catch activity on a certain port and by a certain protocol then you use the port argument and the name of the protocol (e.g. udp) [3]. This would catch all the tcp traffic over SSH.

tcpdump tcp port 22

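You can use tcp, udp, or icmp for the protocols [4]. To match more than one port, join the port expressions with or (as far as I can tell the filter syntax doesn't take a comma-separated list of ports).

tcpdump 'tcp port 22 or tcp port 80'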

Turn off hostname and port translation

The default behavior for tcpdump is to translate the hostnames and ports to something human-readable if possible. To turn this off you pass in the -n argument [3]. Since this stops having to look things up it will reduce the amount of overhead needed by tcpdump.

tcpdump -n -i eno1 port 22
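The filters above compose in a regular way, so they are easy to build from Python before handing them to tcpdump. This is only a sketch: the function names are made up, the addresses are placeholder documentation addresses, and nothing here actually runs tcpdump. It joins separate conditions with and, and joins alternative ports with or, which is the form I understand the pcap filter syntax to expect.

```python
def build_filter(src=None, dst=None, net=None, protocol=None, ports=()):
    """Build a tcpdump filter expression from the pieces described above."""
    parts = []
    if src:
        parts.append("src host {}".format(src))
    if dst:
        parts.append("dst host {}".format(dst))
    if net:
        parts.append("net {}".format(net))
    prefix = "{} ".format(protocol) if protocol else ""
    port_terms = ["{}port {}".format(prefix, port) for port in ports]
    if len(port_terms) > 1:
        parts.append("({})".format(" or ".join(port_terms)))
    elif port_terms:
        parts.append(port_terms[0])
    elif protocol:
        parts.append(protocol)
    return " and ".join(parts)


def build_command(interface, expression, resolve_names=False):
    """Return an argument list suitable for subprocess (not executed here)."""
    command = ["tcpdump", "-i", interface]
    if not resolve_names:
        command.append("-n")  # skip hostname and port translation
    if expression:
        command.append(expression)
    return command


# Placeholder addresses, not real hosts.
expression = build_filter(src="192.0.2.10", protocol="tcp", ports=(22, 80))
print(expression)
print(build_command("eno1", expression))
```

Passing the expression as a single list item means the parentheses never go through a shell, so they don't need quoting.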





Diogenes, Y. & Ozkaya, E. (2018). Cybersecurity, Attack and Defense Strategies : infrastructure security with Red Team and Blue Team tactics. Birmingham, UK: Packt Publishing.


Johansen, G. (2017). Digital forensics and incident response : an intelligent way to respond to attacks. Birmingham, UK: Packt Publishing.


Beltrame, J. (2017). Penetration testing bootcamp : quickly get up and running with pentesting techniques. Birmingham, UK: Packt Publishing.


McPhee. & Beltrame, J. (2016). Penetration testing with Raspberry Pi : learn the art of building a low-cost, portable hacking arsenal using Raspberry Pi 3 and Kali Linux 2. Birmingham, UK: Packt Publishing.


Baxter, J., Orzach, Y. & Mishra, C. (2017). Wireshark revealed : essential skills for IT professionals : get up and running with Wireshark to analyze your network effectively. Birmingham, UK: Packt Publishing.

MNIST Digits With Keras


These are the parts that will make up the model.

The Sequential Model

The Keras Sequential Model is a stack of layers that will make up the neural network.

from keras.models import Sequential

The Dense Layers

The Keras Dense layer is a densely-connected layer within our neural network.

from keras.layers.core import Dense


The Activation represents the activation function for each layer (e.g. relu).

from keras.layers.core import Activation


To tune the model to the data we'll use the Adam optimizer

from keras.optimizers import Adam

Categorical Converter

Finally, since our problem is a classification problem (identify which of 10 digits an image represents) I'll import the Keras to_categorical function to enable classification of our data.

from keras.utils import np_utils
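What to_categorical does to the labels is one-hot encoding: each integer label becomes a vector with a single 1 in the position of its class. The sketch below shows the idea in plain Python; it is not the Keras implementation (which works with numpy arrays), and the function name one_hot is made up for illustration.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    return [[1.0 if index == label else 0.0 for index in range(num_classes)]
            for label in labels]


# Three example digit labels encoded over the ten MNIST classes.
encoded = one_hot([0, 3, 9], num_classes=10)
for row in encoded:
    print(row)
```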

The MNIST dataset is made up of human-classified hand-written digits. Keras includes it as part of their installation so we can load it directly from keras.

from keras.datasets import mnist

We're going to use numpy to reshape the data.

import numpy

To make our output the same every time, I'll set the random seed to April 28, 2018 as a string of digits.


bokeh org-mode


This is an illustration of how to use bokeh with org-mode in nikola. There is a more extensive and flexible explanation of how to do this in this post on cherian.net but I made these notes to understand how it works and to have a simpler example to refer to.

I was interested in doing this because I'm trying to re-work some of what I did for the Coursera Data Science With Python specialization by changing the data-sets and building them as blog posts. I might convert the posts to restructured text at some point, but while I'm working with them I'm using org-mode. Also, while most of the time I use matplotlib for plotting since this is going to be a blog-first approach I decided to go with bokeh. I had previously written about how to get bokeh into Nikola using restructured text, but as an intermediate step I want to do the work in org-mode and still be able to see the output as I'm working.

The magic mix for this seems to be to use:

Creating the Bokeh Plot


These are the dependencies. It's really all bokeh, numpy is just there to generate the data-values.

# from pypi
from bokeh.models import HoverTool
from bokeh.plotting import figure, ColumnDataSource
from bokeh.embed import autoload_static, file_html
import bokeh.resources
import numpy

I probably should save bokeh to this repository to keep the post from breaking in the future, but I'm lazy so I'm just going to import it from a CDN.

bokeh = bokeh.resources.CDN

The Data

To get a simple example going I'm just going to use some random outputs generated by numpy.

X = numpy.arange(10)
Y = numpy.random.randint(0, 10, 10)

In order to create a data-structure that bokeh can use (similar to a pandas dataframe) you need to use a ColumnDataSource.

source = ColumnDataSource(data=dict(
    x=X, y=Y,
    desc=["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"],
))

The keys in the data-dict are essentially the column headers.

Key Description
x the x-axis values
y the y-axis values
desc The labels for the tooltip

Now to get some tool-tips to pop up when you hover over the plot, I'll create a HoverTool.

hover = HoverTool(tooltips=[
    ('index', '$index'),
    ('(x, y)', '($x, $y)'),
    ('desc', '@desc'),
])

The tooltips list maps the labels that will show up in the tooltip (the first argument to each tuple) to variables in the ColumnDataSource (if preceded by an @) or generated value (if preceded by a $). The index value is the index in the array where the data point sits (so for the first point it will be 0, the second will be 1, etc.). The (x, y) values are the coordinate locations of your pointer when you hover over the data points, and the desc will be replaced by the label I set in the ColumnDataSource for a particular data-point.

The Plot

Now I'll create the actual plot (figure).

fig = figure(title="Random Example", x_axis_label="x", y_axis_label="y",
             tools=[hover])
fig.line('x', 'y', source=source)
fig.circle('x', 'y', size=10, source=source)

Getting the Bokeh Plot Into The Post

Finally I'll save the javascript and HTML files needed and then output the blob needed to embed the plot into this post. The autoload_static function takes the bokeh plot object (fig), the bokeh javascript that I loaded earlier (bokeh), and the name of the javascript file that you want it to create (test.js) and returns the javascript to save (javascript) and the HTML fragment that will include the javascript (source). Note that because of the way nikola structures things I have to create a folder named files/posts/bokeh-org-mode and save the files there. Since nikola will automatically look in this folder the name you pass into autoload_static should just be the filename without the path, but when you save the javascript file you will save it there so you need to add the relative path. If my explanation seems a little convoluted, just look at the code below, it's fairly simple.

First I'll create a variable to hold the path to the folder to save the files in. All files for nikola posts go into sub-folders of files/posts/ and since the source file for this post is called bokeh-org-mode.org, the files to include in it go into the folder files/posts/bokeh-org-mode (files/posts/ plus the slug for the post).

FOLDER_PATH = "../files/posts/bokeh-org-mode/"

The Javascript

Now, I'll create the javascript source for the plot.

FILE_NAME = "test.js"
javascript, source = autoload_static(fig, bokeh, FILE_NAME)

with open(FOLDER_PATH + FILE_NAME, "w") as writer:
    writer.write(javascript)

The javascript variable holds the actual javascript source code (which then gets saved) while the source variable holds the string with the HTML to embed the javascript into this post (which I show at the end of this post).

Embedding the Plot

Finally, we need to print out the string that is stored in the source variable which then tells org-mode to embed the files into this post. I'll output the full org-block so you can see the header arguments.

#+BEGIN_SRC python :session bokeh :results output raw :exports results
print('''#+BEGIN_EXPORT html
{}
#+END_EXPORT'''.format(source))

And there you have it. I don't have a lot to say about it, other than that if you hover over the data with your cursor and then look back up at the ColumnDataSource, you'll see that the variables match the inputs.


To get a bokeh figure into an org-mode document in nikola:

  1. Create the bokeh plot.
  2. Create a folder in the files/posts/ folder that matches the slug for the post.
  3. Use autoload_static to convert the bokeh object to javascript and create the HTML tag to embed it.
  4. Save the javascript in the files/posts/<slug>/ folder that you created
  5. Print the HTML fragment in an org-mode #+BEGIN_EXPORT html block.
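The file-handling parts of these steps (2, 4, and 5) can be sketched without bokeh at all. In the sketch below the javascript and HTML strings are stand-ins for what autoload_static would return, the function names are made up, and a temporary directory stands in for the root of the nikola site so nothing real is written.

```python
import os
import tempfile


def save_plot_script(javascript, slug, file_name, site_root):
    """Save the javascript under files/posts/<slug>/ and return its path."""
    folder = os.path.join(site_root, "files", "posts", slug)
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, file_name)
    with open(path, "w") as writer:
        writer.write(javascript)
    return path


def export_block(html_fragment):
    """Wrap an HTML fragment in an org-mode export block."""
    return "#+BEGIN_EXPORT html\n{}\n#+END_EXPORT".format(html_fragment)


# Stand-ins for the two values autoload_static returns.
javascript = "// javascript from autoload_static would go here"
tag = '<script src="test.js" id="example"></script>'

with tempfile.TemporaryDirectory() as site_root:
    saved = save_plot_script(javascript, "bokeh-org-mode", "test.js", site_root)
    relative = os.path.relpath(saved, site_root)
    with open(saved) as reader:
        saved_text = reader.read()

block = export_block(tag)
print(relative)
print(block)
```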