Tutorial 5. Composing Plots

So far we have generated plots using hvPlot, but we haven’t discussed what exactly these plots are and how they differ from the output of other libraries offering the .plot API. It turns out that, as in the previous pn.interact example, the .hvplot() output is actually a rich, compositional object that can be used in many different ways, not just as an immediate plot. Specifically, hvPlot generates HoloViews objects rendered using the Bokeh backend. In the previous notebook we saw that these objects are rendered as interactive Bokeh plots that support hovering, panning, and zooming.

In this notebook, we’ll examine the output of hvPlot calls to take a look at individual HoloViews objects. Then we will see how these “elements” offer us powerful ways of combining and composing layered visualizations.

Read in the data

We’ll read in the data as before, and also reindex by time so that we can more easily do resampling.

import hvplot.dask  # noqa: adds hvplot method to dask objects
import hvplot.pandas  # noqa

from load_data import *

df = load_data()
df = df.sample(frac=0.01)
df.time = df.time.astype("datetime64[ns]")
df.head()

cleaned_df = df.copy()
cleaned_df["wspd"] = df.wspd.where(df.wspd > 0)
cleaned_reindexed_df = cleaned_df.set_index(cleaned_df.time)
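Setting the index to the time column matters because pandas resampling groups rows by a datetime-like index. The following is a minimal sketch on synthetic data; the toy frame and its values are made up for illustration and only mimic the time and wspd columns used above.

```python
import pandas as pd

# Synthetic stand-in for the real data: one observation per day for four
# full weeks, starting on a Monday (hypothetical values, not real readings).
toy = pd.DataFrame(
    {
        "time": pd.date_range("2024-01-01", periods=28, freq="D"),
        "wspd": [float(i % 7) for i in range(28)],
    }
)

# Resampling operates on a datetime-like index, so index by time first.
toy_indexed = toy.set_index("time")

# Weekly bins: count the observations and average the wind speed per bin.
weekly_n = toy_indexed.wspd.resample("1W").count()
weekly_mean = toy_indexed.wspd.resample("1W").mean()
```

With this toy input each weekly bin holds seven observations, so the count and mean series each have four entries indexed by the date that closes the week.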

Composing plots

In this section we’ll start looking at how we can group plots to gain a deeper understanding of the data. We’ll start by resampling the data to explore patterns in the number of wave observations and the mean wind speed over time.

weekly_count = (
    cleaned_reindexed_df.station.resample("1W").count().rename("count")
)
weekly_count_plot = weekly_count.hvplot(title="Count of waves by week")

The first thing to note is that with hvPlot, it is common to grab a handle on the returned output. Unlike the Matplotlib-based .plot API of pandas, where an axis object is returned (with plotting display occurring as a side effect if %matplotlib inline is loaded), grabbing the output of hvplot has no side effects at all (as would be true for typical Python objects as well).

When working with the HoloViews object returned by hvplot, plotting only occurs when we look at the object itself:

weekly_count_plot

Now that we have a handle on this object, we can look at its textual representation by printing it:

print(weekly_count_plot)

This is HoloViews notation for saying that the plot is a Curve element with time as the key dimension (kdim) and count as the value dimension (vdim).

weekly_mean_wspd = cleaned_reindexed_df.wspd.resample("1W").mean()
weekly_mean_wspd_plot = weekly_mean_wspd.hvplot(title="Weekly Mean Wind Speed")
weekly_mean_wspd_plot

print(weekly_mean_wspd_plot)

This plot has time on the x axis like the other, but the value dimension is wspd (the weekly mean wind speed) rather than count. HoloViews supports composing plots from individual elements using the + symbol to see them side-by-side with linked axes for any shared dimensions:

(weekly_mean_wspd_plot + weekly_count_plot).cols(1)

Try zooming in and out to explore the linking between the plots above.

Interestingly, there are three clear peaks in the weekly counts, and two of them correspond to sudden dips in the mean wind speed, while the third corresponds to a peak in the mean wind speed.

Exercise

Use tab completion to explore weekly_count_plot.

Hint

Try accessing .data:

weekly_count_plot.data

Adding a third dimension

Now let’s filter the waves to only include the really gusty ones. We can add extra dimensions to the visualization by using color in addition to x and y.

import hvplot.pandas

Here is how you can color by gust speed (gst) using the pandas .plot API:

most_severe = df[df.gst >= 10]
%matplotlib inline
most_severe.plot.scatter(x='longitude', y='latitude', c='gst')

Here is the analogous version using hvplot where we grab the handle high_wspd_scatter so we can inspect the return value:

high_wspd_scatter = most_severe.hvplot.scatter(
    x="longitude", y="latitude", c="gst"
)
high_wspd_scatter

As always, this return value is actually a HoloViews element which has a printed representation:

print(high_wspd_scatter)

As mentioned earlier, the notion of a ‘scatter’ plot implies that there is an independent variable and at least one dependent variable. This is reflected in the printed representation where the independent variables are in the square brackets and the dependent ones are in parentheses - we can now see that this scatter object implies that latitude is dependent on longitude, which is incorrect. We will learn more about HoloViews objects in the next notebook, and we’ll fix the dimensions below.

But first, let’s adjust the options to create a better plot. First we’ll use colorcet to get a colormap that doesn’t have white at one end, to avoid ambiguity with the page background. We can choose one from the website and use the HoloViews/Bokeh-based colorcet plotting module to make sure it looks good.

import colorcet as cc

from colorcet.plotting import swatch

swatch("CET_L4")

We’ll reverse the colors to align dark reds with gustier waves.

wspd_cmap = cc.CET_L4[::-1]

In addition to fixing the colormap, we will now switch from scatter to using points to correctly reflect that longitude and latitude are independent variables, as well as add some additional columns to the hover text, and add a title.

gusty_points = most_severe.hvplot.points(
    x="longitude",
    y="latitude",
    c="gst",
    hover_cols=["place", "time"],
    cmap=wspd_cmap,
    title="Waves with gusts >= 10",
)

gusty_points

When you hover over the points you’ll see the place and time of the waves in addition to the gust speed and lat/lon. This is reflected in the dimensions that HoloViews is keeping track of:

print(gusty_points)

Exercise

Compare this Points printed representation to the Scatter printed representation and note the differences in how the dimensions are grouped together.

Use the colorcet plotting module swatches(group='linear') to choose a different colormap.

Hint

from colorcet.plotting import swatches
swatches(group='linear')

Overlay with a tiled map

That colormap is better, and we can kind of see the outlines of the continents, but the visualization would be much easier to parse if we added a base map underneath. To do this, we’ll import a tile element from HoloViews, namely the OSM tile source from OpenStreetMap, which uses the Web Mercator projection:

from holoviews.element.tiles import OSM

OSM()

Note that when you zoom in, the map becomes more and more detailed, downloading tiles as necessary. In order to overlay on this basemap, we need to project our waves to the Web Mercator projection system.

import numpy as np
import pandas as pd

from datashader.utils import lnglat_to_meters

To do this we will use the lnglat_to_meters function from the datashader.utils module (imported above) to map longitude and latitude to easting and northing, respectively:

x, y = lnglat_to_meters(most_severe.longitude, most_severe.latitude)
most_severe_projected = most_severe.join(
    [pd.DataFrame({"easting": x}), pd.DataFrame({"northing": y})]
)
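For intuition about what the projection does, here is a sketch of the standard spherical Web Mercator formulas using only the standard library. The function name to_web_mercator is hypothetical and this is not datashader's implementation; it only illustrates the mapping from degrees to meters.

```python
import math

# Sphere radius used by Web Mercator (the WGS84 semi-major axis), in meters.
EARTH_RADIUS_M = 6378137.0


def to_web_mercator(longitude, latitude):
    """Convert degrees of longitude/latitude to easting/northing in meters."""
    # Easting grows linearly with longitude.
    easting = EARTH_RADIUS_M * math.radians(longitude)
    # Northing stretches with latitude, increasingly so towards the poles.
    northing = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(latitude) / 2)
    )
    return easting, northing


east_edge, equator = to_web_mercator(180.0, 0.0)
_, north_45 = to_web_mercator(0.0, 45.0)
_, south_45 = to_web_mercator(0.0, -45.0)
```

The equator maps to a northing of approximately zero, and latitudes mirrored across it map to northings of equal size and opposite sign.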

We can now overlay our points on top of the OSM tile source. Instead of overlaying the tile source explicitly, we can also just specify tiles='OSM' as a string:

most_severe_projected.hvplot.points(
    x="easting",
    y="northing",
    c="wspd",
    hover_cols=["place", "time"],
    cmap=wspd_cmap,
    title="Waves with gusts >= 10",
    tiles="OSM",
    line_color="black",
)

Note that the Web Mercator projection is only one of many possible projections used when working with geospatial data. If you need to work with these different projections, you can use the GeoViews extension to HoloViews that makes elements aware of the projection they are defined in and automatically projects into whatever coordinates are needed for display.

Exercise

Import and use different tiles.

Hint

EsriImagery or Wikipedia.