TensorFlow Tutorials
Github
These are tutorials I found on GitHub.
I was looking for a way to update python dependencies that I'd installed with pip when I stumbled upon pip-tools. I'm not particularly good about keeping everything in sync and up-to-date so I'm hoping that this will make it easier to do and thus more likely that I'll do it. It's been a little while since I first used it and I had to look it up, so these are my notes to my future self.
`pip-tools` installs a command named `pip-compile`, which will use either the requirements you put in your `setup.py` file or a special file named `requirements.in` (if you call it this you don't have to pass in the filename, otherwise you have to tell it where to look). Unless there are only a few requirements I prefer to use a separate file, rather than `setup.py`, since it makes things clearer and more likely that I'll keep it up to date. The `requirements.in` file is a list of your dependencies, but unlike the `requirements.txt` file it doesn't have version numbers; the version numbers are added when you call the `pip-compile` command.
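To make the difference concrete, here's a hypothetical `requirements.in` (the package names are ones that show up later in this post):

```
# requirements.in - just names, no pins (hypothetical example)
nikola
pip-tools
pipdeptree
```

Running `pip-compile` against it would produce a `requirements.txt` with pinned versions (e.g. `nikola==7.8.15`, the version that appears in the output later in this post) plus all of the transitive dependencies.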
So where does the `requirements.in` file come from? You have to make it. But if you're editing things by hand, doesn't this kind of make it less likely you'll maintain it? Yes, which is where `pipdeptree` comes in. `pipdeptree` will list all the python dependencies you installed as well as everything those dependencies pulled in as their dependencies. It's useful for taking a look at how a dependency you didn't directly install got into your virtual environment. You can install it from pypi.
pip install pipdeptree
Here's its help output.
pipdeptree -h
usage: pipdeptree [-h] [-v] [-f] [-a] [-l] [-u] [-w [{silence,suppress,fail}]]
                  [-r] [-p PACKAGES] [-j] [--json-tree]
                  [--graph-output OUTPUT_FORMAT]

Dependency tree of the installed python packages

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -f, --freeze          Print names so as to write freeze files
  -a, --all             list all deps at top level
  -l, --local-only      If in a virtualenv that has global access do not show
                        globally installed packages
  -u, --user-only       Only show installations in the user site dir
  -w [{silence,suppress,fail}], --warn [{silence,suppress,fail}]
                        Warning control. "suppress" will show warnings but
                        return 0 whether or not they are present. "silence"
                        will not show warnings at all and always return 0.
                        "fail" will show warnings and return 1 if any are
                        present. The default is "suppress".
  -r, --reverse         Shows the dependency tree in the reverse fashion ie.
                        the sub-dependencies are listed with the list of
                        packages that need them under them.
  -p PACKAGES, --packages PACKAGES
                        Comma separated list of select packages to show in the
                        output. If set, --all will be ignored.
  -j, --json            Display dependency tree as json. This will yield "raw"
                        output that may be used by external tools. This option
                        overrides all other options.
  --json-tree           Display dependency tree as json which is nested the
                        same way as the plain text output printed by default.
                        This option overrides all other options (except --json).
  --graph-output OUTPUT_FORMAT
                        Print a dependency graph in the specified output
                        format. Available are all formats supported by
                        GraphViz, e.g.: dot, jpeg, pdf, png, svg
If you look at the options you can see that there's a `--freeze` option; that's what we'll be using. Let's look at some of what that output looks like.
pipdeptree --freeze | head
So it looks like the output of `pip freeze`, except it puts the packages you installed flush-left and then uses indentation to indicate what each package installed. In the example above, I installed Nikola, then Nikola installed doit, and doit installed cloudpickle and pyinotify. I kind of remember needing to install `pyinotify` myself, but maybe `pipdeptree` caught that it was a dependency that `doit` is using. Anyway.
For our `requirements.in` file we only want the names, and although there might be a reason to keep the entire tree, I think it makes it easier to understand what I'm using if the file only holds the top-level dependencies (the ones that I'm using directly, rather than a dependency of a dependency). So, we'll use a little grep. First, since I'm a python programmer, I'm going to give it the `-P` flag to use perl escape codes. Next, we want to match only the lines that have alpha-numeric characters as the first character in the line.
| grep | Description |
|---|---|
| `-P`, `--perl-regexp` | Use perl regular expression syntax |
| `^` | Match the beginning of a line |
| `\w` | Match alpha-numeric characters and underscores |
| `+` | Match one or more |
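To see what the pattern is doing, here's a quick Python check of the same idea (my own illustration, not part of the post's grep pipeline); the two sample lines use packages and versions that appear later in this post.

```python
# Rough Python equivalent of grep -P "^\w+": keep only flush-left (top-level) entries.
import re

top_level = re.compile(r"^\w+")  # alpha-numeric/underscore at the very start of the line

lines = [
    "Nikola==7.8.15",    # flush-left: installed directly
    "  doit==0.31.1",    # indented: pulled in by Nikola
]

for line in lines:
    match = top_level.match(line)
    print(line, "->", match.group() if match else "skipped")
# Nikola==7.8.15 -> Nikola
#   doit==0.31.1 -> skipped
```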
First, let's see how many total dependencies there are.
pipdeptree --freeze | wc -l
: 160
So there are 160 dependencies total. How many did I install?
pipdeptree --freeze | grep --perl-regexp "^\w+" | wc -l
Out of the 160 only 11 were directly installed by me.
So we're done, right? Not yet; we need to get rid of the `==` and version numbers. I hadn't known that grep had this feature, since I normally use python instead of grep, but grep has an `--only-matching` option that will discard the parts of the line that don't match.
| grep | Description |
|---|---|
| `-o`, `--only-matching` | Only show the parts of the line that match |
pipdeptree --freeze | grep --only-matching --perl-regexp "^\w+"
If you look at the first entry you'll notice it says `ghp`, but the actual name of the package is `ghp-import`; the hyphen isn't part of the alpha-numeric set, so we'll have to add it.
| grep | Description |
|---|---|
| `[]` | Match one of the entries in the brackets |
pipdeptree --freeze | grep -oP "^[\w\-]+"
This looks like what we want, but there are a couple of entries that we should take care of that can show up in an installed environment:

- `pkg-resources`, which isn't something you can install.
- Editable installs like `-e git+git@github.com:russell-n/iperflexer.git@65f4d3ca72670591f584efa6fa9bfd64c18a925b#egg=iperflexer`.

So we should filter those out.
| grep | Description |
|---|---|
| `-v`, `--invert-match` | Return lines that don't match |
pipdeptree --freeze | grep --only-matching --perl-regexp "^[\w\-]+" | grep --invert-match "\-e\|pkg"
ghp-import2 graphviz Nikola notebook pip-tools pipdeptree virtualfish watchdog webassets wheel ws4py
There are probably other exceptions that have to be added for other installations, but this looks like enough for us. Now we can redirect this to a `requirements.in` file and we're ready for `pip-tools`.
pipdeptree --freeze | grep --only-matching --perl-regexp "^[\w\-]+" | grep --invert-match "\-e\|pkg" > requirements.in
`pip-compile` will read in the `requirements.in` file, add the version numbers, and create a `requirements.txt` file. It will automatically look for the `requirements.in` file, or you can explicitly pass in the filename.
pip-compile | head
#
# This file is autogenerated by pip-compile
# To update, run:
#
#    pip-compile --output-file requirements.txt requirements.in
#
argh==0.26.2              # via watchdog
backcall==0.1.0           # via ipython
bleach==2.1.3             # via nbconvert
blinker==1.4              # via nikola
You'll notice it adds in the dependencies of the dependencies and shows what requires them.
If we stopped at this point we'd have:

- a way to find our directly-installed dependencies using `pipdeptree` (as well as a way to plot the dependencies as a graph)
- a file of just our top-level dependencies (`requirements.in`) to make it easier to read
- a way to generate a pinned `requirements.txt` file using our `requirements.in` file

I think that's kind of nice already, especially if you end up with a lot of dependencies. Try working with sphinx and scikit-learn and you'll see things start to explode. But of course, there's always more.
You can run `pip-compile` with the `--upgrade` option to try and update dependencies whenever you want to make sure you have the latest of everything (you can do it per-package too, but nah).
pip-compile --upgrade | head
#
# This file is autogenerated by pip-compile
# To update, run:
#
#    pip-compile --output-file requirements.txt requirements.in
#
argh==0.26.2              # via watchdog
backcall==0.1.0           # via ipython
bleach==2.1.3             # via nbconvert
blinker==1.4              # via nikola
This will upgrade your installation but not update the `requirements.txt` file, so you can test it out and see if everything works before updating the `requirements.txt`. If things don't work out, you could reinstall from the `requirements.txt` file, but see the next section for another way.
`pip-tools` also installs a command called `pip-sync`, which will keep you in sync with what is in the requirements file, so as long as `requirements.txt` is always a working version, you can sync up with it to avoid problems with changes in any of the dependencies. This is different from the `--upgrade` option in that it will only install the exact versions in the requirements file.
pip-sync
Collecting backcall==0.1.0
Collecting bleach==2.1.3
  Using cached https://files.pythonhosted.org/packages/30/b6/a8cffbb9ab4b62b557c22703163735210e9cd857d533740c64e1467d228e/bleach-2.1.3-py2.py3-none-any.whl
Collecting certifi==2018.4.16
  Using cached https://files.pythonhosted.org/packages/7c/e6/92ad559b7192d846975fc916b65f667c7b8c3a32bea7372340bfe9a15fa5/certifi-2018.4.16-py2.py3-none-any.whl
Collecting cloudpickle==0.5.3
  Using cached https://files.pythonhosted.org/packages/e7/bf/60ae7ec1e8c6742d2abbb6819c39a48ee796793bcdb7e1d5e41a3e379ddd/cloudpickle-0.5.3-py2.py3-none-any.whl
Successfully installed backcall-0.1.0 bleach-2.1.3 certifi-2018.4.16 cloudpickle-0.5.3 decorator-4.3.0 doit-0.31.1 ipykernel-4.8.2 ipython-6.4.0 jedi-0.12.0 jupyter-client-5.2.3 logbook-1.4.0 lxml-4.2.1 natsort-5.3.2 nikola-7.8.15 notebook-5.5.0 parso-0.2.1 pexpect-4.6.0 pillow-5.1.0 python-dateutil-2.7.3 send2trash-1.5.0 tornado-5.0.2 virtualenv-16.0.0 virtualfish-1.0.6 wheel-0.31.1 ws4py-0.5.1
Since I upgraded the installation, the `requirements.txt` file is now behind the latest versions, so by syncing I undid the upgrade. This time I'll upgrade again and save the output.
pip-compile --upgrade
So now the file and my installation should be in sync.
pip-sync
: Everything up-to-date
So there you have it, how to keep dependencies synced. The README for pip-tools is much briefer, but I thought I'd add a little more detail to the part of it that I plan to use the most.
I'm trying to set up a wireless packet monitor (it's something I've long thought might be an interesting source of data, and now I need it for work too). My thought was that I'd set up a Raspberry Pi to experiment with - I don't think it's powerful enough to really work, but it should be fine just for messing with code, and a distributed system might get some interesting results - but when I tried to put the Raspberry Pi's wireless interface into monitor mode I got an error.
iwconfig wlan0 mode monitor

Error for wireless request "Set Mode" (8B06) :
    SET failed on device wlan0 ; Operation not supported.
Looking around on the web I found this reddit post as well as some Stack Overflow posts that said that monitor mode isn't supported on the Raspberry Pi. There is a project called nexmon that apparently lets you add a firmware patch to enable it, which I'll probably try later, but before I tried that I remembered that I have a Realtek 8812AU USB WiFi adapter that I bought a while ago for an old desktop that I wasn't using, so I decided to try it.
The first thing I did was to see if it would just work. I plugged the Realtek into the USB port and although `lsusb` showed it, `iwconfig` didn't show it as an interface. Back to the internet.
Next I found a repository on github that has the driver for the Realtek set up for linux machines. I downloaded it and followed the instructions to build it - the main thing is to set:
CONFIG_PLATFORM_I386_PC = n
CONFIG_PLATFORM_ARM_RPI = y
in the `Makefile` - but when I tried to build it I got this error.
sudo dkms install -m $DRV_NAME -v $DRV_VERSION

'make' KVER=4.4.38-v7+....(bad exit status: 2)
ERROR (dkms apport): binary package for rtl8812AU: 4.3.20 not found
Error! Bad return status for module build on kernel: 4.4.38-v7+ (armv7l)
Consult /var/lib/dkms/rtl8812AU/4.3.20/build/make.log for more information.
There was also a message in the `make.log` file, but I didn't remember to copy it.
The solution was in this StackOverflow post - the `make` program is being pointed to a folder named `armv7l` (that's a lowercase 'L' at the end, not a one) but it should actually be pointed to the one named `arm`. The simple solution is to create a symbolic link with the name that `make` expects.
sudo ln -s /usr/src/linux-headers-4.4.38-v7+/arch/arm/ /usr/src/linux-headers-4.4.38-v7+/arch/armv7l
This turns out to fix the build problem and after a reboot the network interface showed up.
The Raspberry Pi 3 doesn't support monitor mode for its wireless interface out of the box, and while there is a firmware patch to enable it, I chose to use a Realtek RTL 8812AU USB WiFi adapter instead. You need a little bit of extra work to get it going, but it does seem to work. One thing I noticed is that `iwconfig` will put it in monitor mode but `airmon-ng` doesn't (I haven't figured out why yet). It doesn't report an error, it just doesn't seem to work. Also, `iw` always reports the interface as managed, even when it isn't… maybe I'll try the firmware patch after all.
[R1] Veselý V. Extended Comparison Study on Merging PCAP Files. ElectroScope. 2012 Oct 1;2012:1–6.
This paper talks about the three main packet-capture file-formats (LibPCAP, PCAPng, and NetMon) and some of the programs used to merge the different formats.
[R2] Dulaunoy A. Packet Capture, Filtering and Analysis - Today’s Challenges with 20 Years Old Issues. :38.
This is a presentation for a security conference. It has a list of common packet capture tools as well as ways to attack some of them while they are capturing packets.
Gulp purports to be better at capturing packets than tcpdump (although they can work together).
There is more than one version out there:
- This one says it applied a patch to it five years ago.
- More easily obtainable and better documentation available (although still not enough).
Captures packets and decodes SSL/TLS packets.
This adds indexing to bgzip compressed LibPCAP files which then lets you extract them while the original files are still compressed.
This lets you extract part of or combine files created by tcpdump when using file rotation.
Describes itself as like GNU grep but for packets.
These are installed when you install wireshark.
Reorders the packets by timestamp.
This prints summary information about packet files (works with gzipped files).
Merges multiple packet files together. Mergecap will try to keep timestamps in order when merging, but it assumes each individual file to merge is already in order.
Track, reassemble, reorder TCP streams.
Gives connection information taken from a capture file.
Separates out TCP flows into separate files.
This is documented on the Flask site, but I was trying to help someone debug some old server code that I'd written and couldn't remember how to debug it, so I'm documenting it here as I go through remembering it again, so that I'll have a single reference to use the next time. Some of the settings look different from what I remember using, so I think Flask has changed a little over time, but since I didn't document it the first time I don't have a record to compare against (well, I probably have some notes in a notebook, but it's hard to refer to that when sitting at someone else's desk).
The Flask Quickstart tells you what to do, but for some reason when we googled it the instructions were different; I think it might have led us to an older form of the documentation. This is the current version (as of May 20, 2018).
First you have to tell flask which file contains your flask app by setting the `FLASK_APP` environment variable. In my case I'm using connexion, an oddly-named adapter that adds support for swagger/OpenAPI to Flask. So the file that has the app has this line in it.
application = connexion.FlaskApp(__name__)
In this case that's a file named `api.py` which we'll say is in the `server` folder (it isn't, but that's okay), so we need to set our environment accordingly. I use the fish shell so the syntax is slightly different from the Quickstart example. Also - and this caused me a lot of trouble - when I didn't pass in the name of my `FlaskApp` instance I got this error:

Error: Failed to find application in module "server.api". Are you sure it contains a Flask application? Maybe you wrapped it in a WSGI middleware or you are using a factory function.

So I had to specifically tell flask the name of my app by appending it to the end of the setting (perhaps it is looking for `app` specifically, but I called mine `application`).
set -x FLASK_APP server.api:application
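For context, here's a rough sketch of what a `server/api.py` module along these lines might contain. Only the `application = connexion.FlaskApp(__name__)` line comes from the actual code above; the `add_api` call and the `swagger.yml` filename are assumptions for illustration.

```python
# server/api.py - hypothetical sketch of a connexion-based app module
import connexion

# this is the object FLASK_APP points at (server.api:application)
application = connexion.FlaskApp(__name__)

# connexion wires up the routes from a swagger/OpenAPI specification
# (the filename here is an assumption)
application.add_api("swagger.yml")

if __name__ == "__main__":
    # running the module directly, as an alternative to `flask run`
    application.run(port=5000)
```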
If you want the server to automatically reload when it detects changes then you should set the `FLASK_ENV` environment variable to `development`. This is similar to using `FLASK_DEBUG`, but I think it adds the reloading. Anyway, it does more than just set debug mode.
set -x FLASK_ENV development
This is the output of the help string for the development server. Note that it uses `-h` for `host`, so you need to pass in `--help` to see this output or you will get an error.
flask run --help
The default server runs on localhost, but since I'm hosting the code on a raspberry pi sitting on the network somewhere but debugging it remotely, I need to run it on a public address.
flask run --host=0.0.0.0
These are notes I made while surfing the web looking into tcpdump. You will most likely need to use `sudo` to run most of the commands, but I'm leaving it off to make things shorter.
You can ask `tcpdump` which interfaces it is able to listen to [2].
tcpdump -D
To capture packets on an interface you pass its name to the `-i` flag [2] (here the interface I'll use is `eno1`).
tcpdump -i eno1
The default behavior is for `tcpdump` to send the output to standard output; to have it save the packets to a file, use the `-w` flag [2] (you can call the file anything, I'll call it `dump.pcap`).
tcpdump -i eno1 -w dump.pcap
To increase the amount of information that's captured, pass multiple `v` arguments [2] (in this case I'll use `-vvv`).
tcpdump -i eno1 -vvv -w dump.pcap
You can get all the packets being sent or received by a host using the `host` argument [3].
tcpdump host 192.168.1.12
To filter out all the packets except those that are going to a specific target, use the `dst host` argument [2].
tcpdump -i eno1 dst host 192.168.1.1
You can combine parameters using the logical operators `and`, `or`, and `not` [3].
tcpdump 'src 192.168.1.1 and dst 192.168.1.12'
The single quotes are optional and are just used to group the arguments together.
If you want to catch activity only on a certain port and by a certain protocol, then you use the `port` argument and the name of the protocol (e.g. `udp`) [3]. This would catch all the `tcp` traffic over SSH.
tcpdump tcp port 22
You can use `tcp`, `udp`, or `icmp` for the protocols and add multiple ports using a comma [4].
tcpdump tcp port 22,80
The default behavior for `tcpdump` is to translate the hostnames and ports to something human-readable if possible. To turn this off, pass in the `-n` argument [3]. Since this avoids having to look things up, it reduces the amount of overhead needed by `tcpdump`.
tcpdump -n -i eno1 port 22
Diogenes, Y. & Ozkaya, E. (2018). Cybersecurity, Attack and Defense Strategies: infrastructure security with Red Team and Blue Team tactics. Birmingham, UK: Packt Publishing.
Johansen, G. (2017). Digital forensics and incident response : an intelligent way to respond to attacks. Birmingham, UK: Packt Publishing.
Beltrame, J. (2017). Penetration testing bootcamp : quickly get up and running with pentesting techniques. Birmingham, UK: Packt Publishing.
McPhee. & Beltrame, J. (2016). Penetration testing with Raspberry Pi : learn the art of building a low-cost, portable hacking arsenal using Raspberry Pi 3 and Kali Linux 2. Birmingham, UK: Packt Publishing.
Baxter, J., Orzach, Y. & Mishra, C. (2017). Wireshark revealed : essential skills for IT professionals : get up and running with Wireshark to analyze your network effectively. Birmingham, UK: Packt Publishing.
These are the parts that will make up the model.
The Keras Sequential Model is a stack of layers that will make up the neural network.
from keras.models import Sequential
The Keras Dense layer is a densely-connected layer within our neural network.
from keras.layers.core import Dense
The Activation represents the activation function for each layer (e.g. relu).
from keras.layers.core import Activation
To tune the model to the data we'll use the Adam optimizer.
from keras.optimizers import Adam
Finally, since our problem is a classification problem (identifying which of 10 digits an image represents), I'll import Keras' `np_utils` module, whose `to_categorical` function one-hot encodes the labels so we can classify our data.
from keras.utils import np_utils
The MNIST dataset is made up of human-classified hand-written digits. Keras includes it as part of their installation so we can load it directly from keras.
from keras.datasets import mnist
We're going to use numpy to reshape the data.
import numpy
To make our output the same every time, I'll set the random seed to April 28, 2018 as a string of digits.
numpy.random.seed(4282018)
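The parts list stops here, so below is a minimal sketch (my own, not from the original notes) of how these pieces might be assembled; the layer sizes, batch size, and number of epochs are arbitrary choices for illustration.

```python
# A minimal sketch assembling the imported parts (layer sizes and training
# settings are my own choices, not from the original post).
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import Adam
from keras.utils import np_utils
import numpy

numpy.random.seed(4282018)

# load MNIST and flatten the 28x28 images into 784-element vectors scaled to [0, 1]
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784).astype("float32") / 255
x_test = x_test.reshape(10000, 784).astype("float32") / 255

# one-hot encode the digit labels (10 classes)
y_train = np_utils.to_categorical(y_train, 10)
y_test = np_utils.to_categorical(y_test, 10)

# a small fully-connected network: one hidden relu layer, softmax output
model = Sequential()
model.add(Dense(512, input_shape=(784,)))
model.add(Activation("relu"))
model.add(Dense(10))
model.add(Activation("softmax"))

model.compile(loss="categorical_crossentropy",
              optimizer=Adam(),
              metrics=["accuracy"])
model.fit(x_train, y_train,
          batch_size=128,
          epochs=5,
          validation_data=(x_test, y_test))
```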
This is an illustration of how to use bokeh with org-mode in nikola. There is a more extensive and flexible explanation of how to do this in this post on cherian.net but I made these notes to understand how it works and to have a simpler example to refer to.
I was interested in doing this because I'm trying to re-work some of what I did for the Coursera Data Science With Python specialization by changing the data-sets and building them as blog posts. I might convert the posts to restructured text at some point, but while I'm working with them I'm using org-mode. Also, while most of the time I use matplotlib for plotting since this is going to be a blog-first approach I decided to go with bokeh. I had previously written about how to get bokeh into Nikola using restructured text, but as an intermediate step I want to do the work in org-mode and still be able to see the output as I'm working.
The magic mix for this seems to be to use bokeh's `autoload_static` function together with an org-mode source block whose results are exported as raw HTML (the full header arguments are shown in the source block at the end of this post).
These are the dependencies. It's really all `bokeh`; `numpy` is just there to generate the data values.
# from pypi
from bokeh.models import HoverTool
from bokeh.plotting import figure, ColumnDataSource
from bokeh.embed import autoload_static, file_html
import bokeh.resources
import numpy
I probably should save bokeh to this repository to keep the post from breaking in the future, but I'm lazy so I'm just going to import it from a CDN.
bokeh = bokeh.resources.CDN
To get a simple example going I'm just going to use some random outputs generated by numpy.
X = numpy.arange(10)
Y = numpy.random.randint(0, 10, 10)
In order to create a data-structure that bokeh can use (similar to a pandas dataframe) you need to use a ColumnDataSource.
source = ColumnDataSource(data=dict(
x=X,
y=Y,
desc=["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"],
))
The keys in the data-dict are essentially the column headers.
| Key | Description |
|---|---|
| x | The x-axis values |
| y | The y-axis values |
| desc | The labels for the tooltip |
Now, to get some tool-tips to pop up when you hover over the plot, I'll create a `HoverTool`.
hover = HoverTool(tooltips=[
('index', '$index'),
('(x, y)', '($x, $y)'),
('desc', '@desc'),
])
The `tooltips` list maps the labels that will show up in the tooltip (the first argument to each tuple) to variables in the `ColumnDataSource` (if preceded by an `@`) or generated values (if preceded by a `$`). The `index` value is the index in the array where the data point sits (so for the first point it will be 0, the second will be 1, etc.). The `(x, y)` values are the coordinate locations of your pointer when you hover over the data points, and the `desc` will be replaced by the label I set in the `ColumnDataSource` for that particular data-point.
Now I'll create the actual plot (`figure`).
fig = figure(title="Random Example", x_axis_label="x", y_axis_label="y",
tools=[hover])
fig.line('x', 'y', source=source)
fig.circle('x', 'y', size=10, source=source)
Finally I'll save the javascript and HTML files needed and then output the blob needed to embed the plot into this post. The `autoload_static` function takes the bokeh plot object (`fig`), the bokeh javascript that I loaded earlier (`bokeh`), and the name of the javascript file that you want it to create (`test.js`), and returns the javascript to save (`javascript`) and the HTML fragment that will include the javascript (`source`). Note that because of the way nikola structures things I have to create a folder named `files/posts/bokeh-org-mode` and save the files there. Since nikola will automatically look in this folder, the name you pass into `autoload_static` should just be the filename without the path, but when you save the javascript file you will save it there, so you need to add the relative path. If my explanation seems a little convoluted, just look at the code below; it's fairly simple.
First I'll create a variable to hold the path to the folder to save the files in. All files for nikola posts go into sub-folders of `files/posts/`, and since the source file for this post is called `bokeh-org-mode.org`, the files to include in it go into the folder `files/posts/bokeh-org-mode` (`files/posts/` plus the slug for the post).
FOLDER_PATH = "../files/posts/bokeh-org-mode/"
Now, I'll create the javascript source for the plot.
FILE_NAME = "test.js"
javascript, source = autoload_static(fig, bokeh, FILE_NAME)
with open(FOLDER_PATH + FILE_NAME, "w") as writer:
writer.write(javascript)
The `javascript` variable holds the actual javascript source code (which then gets saved) while the `source` variable holds the string with the HTML to embed the javascript into this post (which I show at the end of this post).

Finally, we need to print out the string that is stored in the `source` variable, which then tells org-mode to embed the files into this post. I'll output the full org-block so you can see the header arguments.
#+BEGIN_SRC python :session bokeh :results output raw :exports results
print('''#+BEGIN_EXPORT html
{}
#+END_EXPORT'''.format(source))
#+END_SRC
And there you have it. I don't have a lot to say about it, other than that if you hover over the data with your cursor and then look up at the `ColumnDataSource` above, you'll see that the variables match the inputs.
To get a bokeh figure into an org-mode document in nikola:

- Create a folder in the `files/posts/` folder that matches the slug for the post.
- Use `autoload_static` to convert the bokeh object to javascript and create the HTML tag to embed it.
- Save the javascript file into the `files/posts/<slug>/` folder that you created.
- Print the HTML tag inside a `#+BEGIN_EXPORT html` block.