HoloViews Gridded Data
Table of Contents
The Beginning
This is a reproduction of the HoloViews Gridded Datasets example in the Getting Started section of their documentation to make sure that I can do it myself. The Gridded Data that they're referring to is data with more than two dimensions (a table has two dimensions - columns and rows). One way to display this type of data might be to use three dimensions (assuming there's only three), but the suggestion here is to use a slider to step through the data instead (so the third dimension is time or something similar).
The Middle
Set Up
Imports
The Dotenv
load_dotenv(".env")
Bokeh
#+begin_src python :session holoviews :results none
holoviews.extension("bokeh")
The Embedder
files_path = Path("../../files/posts/libraries/holoviews-gridded-data/")
Embed = partial(
EmbedBokeh,
folder_path=files_path)
The Output Path
This is where to store files and images that get created and need to get grabbed by nikola.
OUTPUT_PATH = Path("../../files/posts/libraries/holoviews-gridded-data")
assert OUTPUT_PATH.is_dir()
The Data
data_path = Path(os.environ.get("PHOTONS")).expanduser()
assert data_path.is_file()
data = numpy.load(str(data_path))
calcium = data["Calcium"]
print(calcium.shape)
(62, 111, 50)
The numpy.load function loads a pickled dict-like object that doesn't unpickle the data until you retrieve it (e.g. when I created the calcium
variable). The data is a 3D array of images with the third dimension representing time (it's a set of images that change over time) so there are 50 time-steps (I don't know what the imagaes actually are or what the time-intervals are).
I'm going to load the data into a HoloViews Dataset, but is going to need time, x, and y coordinates to render the images. To make it simpler I'll just pass in integer arrays with zero-indexed indices for the images. Since this isn't really a dataset that I care about I don't know what the orientation of the image is, but the example set it up as a wide image.
data_set = holoviews.Dataset((numpy.arange(50), numpy.arange(111), numpy.arange(62), calcium),
["Time", "x", "y"], "Fluoresence")
print(data_set)
:Dataset [Time,x,y] (Fluoresence)
Note that holoviews treats labels in lists differently than tuples - if it's a tuple then it thinks it represents a (<variable>, <label>)
pair and a list is treated as just labels, which is what we want here.
The example says that really we should have used xarray to load the data, so this is how to convert it to and xarray
.
print(data_set.clone(datatype=["xarray"]).data)
<xarray.Dataset> Dimensions: (Time: 50, x: 111, y: 62) Coordinates: * Time (Time) int64 0 1 2 3 4 5 6 7 8 9 ... 41 42 43 44 45 46 47 48 49 * x (x) int64 0 1 2 3 4 5 6 7 8 ... 103 104 105 106 107 108 109 110 * y (y) int64 0 1 2 3 4 5 6 7 8 9 ... 52 53 54 55 56 57 58 59 60 61 Data variables: Fluoresence (y, x, Time) uint16 386 441 196 318 525 ... 806 801 899 583 774
But we want to use a Dataset, not an x-array so this was just for illustration…
Build It
Set Up the Plot
Before doing the plot I'll set the defaults for it.
opts.defaults(
opts.GridSpace(shared_xaxis=True, shared_yaxis=True),
opts.Image(cmap="viridis", width=400, height=400),
opts.Labels(text_color="white", text_font_size="8pt", text_align="left", text_baseline="bottom"),
opts.Path(color="white"),
opts.Spread(width=600, tools=["hover"]),
opts.Overlay(show_legend=False)
)
Note that if you don't setup the backend with holoviews.extension
then the opts won't have any of the attributes like GridSpace
, Image
, etc.
plot = data_set.to(holoviews.Image, ["x", "y"]).hist()
file_name = "grid_image.html"
output = OUTPUT_PATH.joinpath(file_name)
holoviews.save(plot, output)
print("[[file:{}][Link to the plot.]]".format(file_name))
Zoom In
HoloViews provides a way to select out Regions of Interest (ROI). The pickle we loaded earlier has coordinates for rectangular bounding boxes in it (under the ROIs
key).
regions_of_interest = data["ROIs"]
bounds = holoviews.Path([holoviews.Bounds(tuple(region)) for region in regions_of_interest])
print(regions_of_interest.shape)
(147, 4)
labels = holoviews.Labels([(roi[0], roi[1], i) for i, roi in enumerate(regions_of_interest)])
plot = (data_set[21].to(holoviews.Image, ['x', 'y']) * bounds * labels).relabel('Time: 21')
file_name = "bounds.html"
output = OUTPUT_PATH.joinpath(file_name)
Embed(plot, file_name)()
Select the Facet
x0, y0, x1, y1 = regions_of_interest[60]
roi = data_set.select(x=(x0, x1), y=(y0, y1), time=(250, 280)).relabel('ROI #60')
plot = roi.to(holoviews.Image, ['x', 'y'])
file_name = "selection.html"
output = OUTPUT_PATH.joinpath(file_name)
holoviews.save(plot, output)
print("[[file:{}][link]]".format(file_name))
Mean and Standard Deviation
agg = roi.aggregate('Time', numpy.mean, spreadfn=numpy.std)
plot = holoviews.Spread(agg) * holoviews.Curve(agg)
file_name = "spread.html"
output = OUTPUT_PATH.joinpath(file_name)
holoviews.save(plot, output)
print("[[file:{}][Link]]".format(file_name))