Tutorial 5. Composing Plots
So far we have generated plots using hvPlot, but we haven’t discussed what exactly these plots are and how they differ from the output of other libraries offering the .plot API. It turns out that, as in the previous pn.interact example, the .hvplot() output is actually a rich, compositional object that can be used in many different ways, not just as an immediate plot. Specifically, hvPlot generates HoloViews objects rendered using the Bokeh backend. In the previous notebook we saw that these objects are rendered as interactive Bokeh plots that support hovering, panning, and zooming.
In this notebook, we’ll examine the output of hvPlot calls to take a look at individual HoloViews objects. Then we will see how these “elements” offer us powerful ways of combining and composing layered visualizations.
Read in the data#
We’ll read in the data as before, and also reindex by time so that we can more easily do resampling.
import hvplot.dask # noqa: adds hvplot method to dask objects
import hvplot.pandas # noqa
from load_data import *
df = load_data()
df = df.sample(frac=0.01)
df.time = df.time.astype("datetime64[ns]")
df.head()
cleaned_df = df.copy()
cleaned_df["wspd"] = df.wspd.where(df.wspd > 0)
cleaned_reindexed_df = cleaned_df.set_index(cleaned_df.time)
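To see what the masking and reindexing steps do in isolation, here is a minimal sketch on a small made-up table. The toy frame and its values are hypothetical stand-ins, not the buoy data, and the sketch uses set_index("time"), which moves the column into the index instead of keeping a copy of it as the code above does.

```python
import pandas as pd

# Hypothetical stand-in for the buoy data: two weeks of daily readings,
# starting on a Monday so that each weekly bin holds exactly seven rows.
toy = pd.DataFrame(
    {
        "time": pd.date_range("2021-01-04", periods=14, freq="D"),
        "station": ["A", "B"] * 7,
        "wspd": [5.0, 0.0, 7.0, 3.0, 0.0, 6.0, 4.0,
                 8.0, 2.0, 0.0, 9.0, 1.0, 5.0, 3.0],
    }
)

# .where keeps values that satisfy the condition and replaces the rest with NaN,
# so non-positive wind speeds no longer drag down later averages.
toy["wspd"] = toy.wspd.where(toy.wspd > 0)

# A DatetimeIndex is what makes time-based resampling possible.
toy_by_time = toy.set_index("time")

# "1W" groups rows into weekly bins (ending on Sundays by default).
weekly_count = toy_by_time.station.resample("1W").count().rename("count")
weekly_mean = toy_by_time.wspd.resample("1W").mean()  # NaN values are skipped

print(weekly_count)
print(weekly_mean)
```

Because the masked readings become NaN rather than being dropped, the rows still count towards the number of observations per week while being ignored by the mean.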
Composing plots#
In this section we’ll start looking at how we can group plots to gain a deeper understanding of the data. We’ll start by resampling the data to explore patterns in the number of recorded waves over time.
weekly_count = (
cleaned_reindexed_df.station.resample("1W").count().rename("count")
)
weekly_count_plot = weekly_count.hvplot(title="Count of waves by week")
The first thing to note is that with hvplot, it is common to grab a handle on the returned output. Unlike the matplotlib-based .plot API of pandas, where an axis object is returned (with plotting display occurring as a side-effect if matplotlib inline is loaded), grabbing the output of hvplot has no side-effects at all (as would be true for typical Python objects as well).
When working with the HoloViews object returned by hvplot, plotting only occurs when we look at the object itself:
weekly_count_plot
Now that we have a handle on this object, we can look at its textual representation by printing it:
print(weekly_count_plot)
This is HoloViews notation for saying that the plot is a Curve element with time as the key dimension (kdim) and count as the value dimension (vdim).
weekly_mean_wspd = cleaned_reindexed_df.wspd.resample("1W").mean()
weekly_mean_wspd_plot = weekly_mean_wspd.hvplot(title="Weekly Mean Wind Speed")
weekly_mean_wspd_plot
print(weekly_mean_wspd_plot)
This plot has time on the x axis like the other, but the value dimension is wspd rather than count. HoloViews supports composing plots from individual elements using the + symbol to see them side-by-side with linked axes for any shared dimensions:
(weekly_mean_wspd_plot + weekly_count_plot).cols(1)
Try zooming in and out to explore the linking between the plots above.
Interestingly, there are three clear peaks in the weekly counts, and two of them correspond to sudden dips in the mean wind speed, while the third corresponds to a peak in the mean wind speed.
Exercise#
Use tab completion to explore weekly_count_plot.
Hint
Try accessing .data:
Adding a third dimension#
Now let’s filter the waves to only include the really gusty ones. We can add extra dimensions to the visualization by using color in addition to x and y.
import hvplot.pandas
Here is how you can color by gust speed (gst) using the pandas .plot API:
most_severe = df[df.gst >= 10]
%matplotlib inline
most_severe.plot.scatter(x='longitude', y='latitude', c='gst')
Here is the analogous version using hvplot where we grab the handle high_wspd_scatter so we can inspect the return value:
high_wspd_scatter = most_severe.hvplot.scatter(
x="longitude", y="latitude", c="gst"
)
high_wspd_scatter
As always, this return value is actually a HoloViews element which has a printed representation:
print(high_wspd_scatter)
As mentioned earlier, the notion of a ‘scatter’ plot implies that there is an independent variable and at least one dependent variable. This is reflected in the printed representation where the independent variables are in the square brackets and the dependent ones are in parentheses - we can now see that this scatter object implies that latitude is dependent on longitude, which is incorrect. We will learn more about HoloViews objects in the next notebook, and we’ll fix the dimensions below.
But first, let’s adjust the options to create a better plot. First we’ll use colorcet to get a colormap that doesn’t have white at one end, to avoid ambiguity with the page background. We can choose one from the colorcet website and use the HoloViews/Bokeh-based colorcet plotting module to make sure it looks good.
import colorcet as cc
from colorcet.plotting import swatch
swatch("CET_L4")
We’ll reverse the colors to align dark reds with gustier waves.
wspd_cmap = cc.CET_L4[::-1]
In addition to fixing the colormap, we will now switch from scatter to using points to correctly reflect that longitude and latitude are independent variables, as well as add some additional columns to the hover text, and add a title.
gusty_points = most_severe.hvplot.points(
x="longitude",
y="latitude",
c="gst",
hover_cols=["place", "time"],
cmap=wspd_cmap,
title="Waves with gusts >= 10",
)
gusty_points
When you hover over the points you’ll see the place and time of the waves in addition to the gust speed and lat/lon. This is reflected in the dimensions that HoloViews is keeping track of:
print(gusty_points)
Exercise#
Compare this Points printed representation to the Scatter printed representation and note the differences in how the dimensions are grouped together.
Use the colorcet plotting module swatches(group='linear') to choose a different colormap.
Hint
from colorcet.plotting import swatches
swatches(group='linear')
Overlay with a tiled map#
That colormap is better, and we can kind of see the outlines of the continents, but the visualization would be much easier to parse if we added a base map underneath. To do this, we’ll import a tile element from HoloViews, namely the OSM tile from OpenStreetMap using the Web Mercator projection:
from holoviews.element.tiles import OSM
OSM()
Note that when you zoom in, the map becomes more and more detailed, downloading tiles as necessary. In order to overlay on this basemap, we need to project our waves to the Web Mercator projection system.
import numpy as np
import pandas as pd
from datashader.utils import lnglat_to_meters
To do this we will use the lnglat_to_meters function in the datashader.utils module to map longitude and latitude to easting and northing respectively:
x, y = lnglat_to_meters(most_severe.longitude, most_severe.latitude)
most_severe_projected = most_severe.join(
[pd.DataFrame({"easting": x}), pd.DataFrame({"northing": y})]
)
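For intuition about what this projection step involves, here is a minimal NumPy sketch of the standard spherical Web Mercator forward formula. The helper name lnglat_to_web_mercator is hypothetical, and the code is written from the textbook formula as an assumption about the conversion; it is not a copy of datashader's implementation.

```python
import numpy as np

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used as the sphere radius


def lnglat_to_web_mercator(longitude, latitude):
    """Hypothetical helper: project degrees to Web Mercator meters."""
    lon = np.asarray(longitude, dtype=float)
    lat = np.asarray(latitude, dtype=float)
    # Easting grows linearly with longitude.
    easting = EARTH_RADIUS_M * np.radians(lon)
    # Northing stretches towards the poles (undefined at exactly +/-90 degrees).
    northing = EARTH_RADIUS_M * np.log(np.tan(np.pi / 4 + np.radians(lat) / 2))
    return easting, northing


# Origin, the antimeridian on the equator, and a point off the US east coast.
easting, northing = lnglat_to_web_mercator([0.0, 180.0, -70.2], [0.0, 0.0, 42.3])
print(easting)
print(northing)
```

Longitude 180 on the equator maps to an easting of about 20,037,508 m, which is half the width of the Web Mercator world and a handy sanity check for projected coordinates.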
We can now overlay our points on top of the OSM tile source, but instead of overlaying the tile source explicitly we can also just specify tiles='OSM' as a string:
most_severe_projected.hvplot.points(
x="easting",
y="northing",
c="wspd",
hover_cols=["place", "time"],
cmap=wspd_cmap,
title="Waves with gusts >= 10",
tiles="OSM",
line_color="black",
)
Note that the Web Mercator projection is only one of many possible projections used when working with geospatial data. If you need to work with these different projections, you can use the GeoViews extension to HoloViews that makes elements aware of the projection they are defined in and automatically projects into whatever coordinates are needed for display.
Exercise#
Import and use different tiles.
Hint
EsriImagery or Wikipedia.