Plotly graphics

Plotly graphics#

Plotly is one of the major players in interactive graphics.
Some alternatives are Bokeh and Altair.
The Python interface comes in two main flavours:
- plotly.graph_objects: low-level graphics handling
- plotly.express: high-level graphics handling
In addition, Plotly is integrated into the Dash framework, which has its own dialect.
Figures behave like nested dictionaries, which we will leverage.

# The following renders plotly graphs in Jupyter Notebook, Jupyter Lab and VS Code formats
import plotly.io as pio
pio.renderers.default = "notebook+plotly_mimetype"

Plotting with AI assistance#

Many plot commands can be obtained by describing plots to AIs.
AIs can also translate from one plotting framework to another.
Sketching a set of plots and adding sufficiently detailed descriptions may result in usable code.

Basic plotting#

# Gapminder dataset of health and wealth stats for different countries
import plotly.express as px
df = px.data.gapminder()
df.head()

	country	continent	year	lifeExp	pop	gdpPercap	iso_alpha	iso_num
0	Afghanistan	Asia	1952	28.801	8425333	779.445314	AFG	4
1	Afghanistan	Asia	1957	30.332	9240934	820.853030	AFG	4
2	Afghanistan	Asia	1962	31.997	10267083	853.100710	AFG	4
3	Afghanistan	Asia	1967	34.020	11537966	836.197138	AFG	4
4	Afghanistan	Asia	1972	36.088	13079460	739.981106	AFG	4

Line plot#

# Create a line plot of life expectancy over time for Norway.
# Let the figure be 400 pixels high and 700 pixels wide.
# Set the title to 'Life Expectancy in Norway'.
# Set the x-axis label to 'Year'.
# Set the y-axis label to 'Life Expectancy (years)'.
fig = px.line(df[df['country'] == 'Norway'], x='year', y='lifeExp', title='Life Expectancy in Norway', width=700, height=400)
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Life Expectancy (years)')
fig

# Create a plot with one line for Norway and one line for Sweden in the same style as the plot above.
# Let the legend title be 'Country'.
fig = px.line(df[df['country'].isin(['Norway', 'Sweden'])], x='year', y='lifeExp', color='country', width=700, height=400)
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Life Expectancy (years)')
fig.update_layout(legend_title_text='Country')
fig

# The dictionary defining the figure
print(fig)

Figure({
    'data': [{'hovertemplate': 'country=Norway<br>year=%{x}<br>lifeExp=%{y}<extra></extra>',
              'legendgroup': 'Norway',
              'line': {'color': '#636efa', 'dash': 'solid'},
              'marker': {'symbol': 'circle'},
              'mode': 'lines',
              'name': 'Norway',
              'orientation': 'v',
              'showlegend': True,
              'type': 'scatter',
              'x': array([1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, 2002, 2007]),
              'xaxis': 'x',
              'y': array([72.67 , 73.44 , 73.47 , 74.08 , 74.34 , 75.37 , 75.97 , 75.89 , 77.32 ,
                          78.32 , 79.05 , 80.196]),
              'yaxis': 'y'},
             {'hovertemplate': 'country=Sweden<br>year=%{x}<br>lifeExp=%{y}<extra></extra>',
              'legendgroup': 'Sweden',
              'line': {'color': '#EF553B', 'dash': 'solid'},
              'marker': {'symbol': 'circle'},
              'mode': 'lines',
              'name': 'Sweden',
              'orientation': 'v',
              'showlegend': True,
              'type': 'scatter',
              'x': array([1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, 2002, 2007]),
              'xaxis': 'x',
              'y': array([71.86 , 72.49 , 73.37 , 74.16 , 74.72 , 75.44 , 76.42 , 77.19 , 78.16 ,
                          79.39 , 80.04 , 80.884]),
              'yaxis': 'y'}],
    'layout': {'height': 400,
               'legend': {'title': {'text': 'Country'}, 'tracegroupgap': 0},
               'margin': {'t': 60},
               'template': '...',
               'width': 700,
               'xaxis': {'anchor': 'y', 'domain': [0.0, 1.0], 'title': {'text': 'Year'}},
               'yaxis': {'anchor': 'x', 'domain': [0.0, 1.0], 'title': {'text': 'Life Expectancy (years)'}}}
})

print(fig['data'][0]['line']['color'])

#636efa

Directly editing the dictionary#

fig['data'][0]['line']['color'] = "#000000"

fig

Shaded areas#

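The fill parameter can shade the area between a trace and the previously added trace ('tonexty'), between a trace and zero ('tozeroy'), or inside the trace itself ('toself') when the series doubles back on itself.
The latter is most convenient for single-colour background shading.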

# Plot the mean life expectancy in Europe over time. Shade the area between minimum and maximum life expectancy in Europe over time.
# Overlay Norway's life expectancy over the plot.
# https://plotly.com/python/continuous-error-bars/
dfE = df[df['continent'] == 'Europe'][['year', 'lifeExp']].groupby('year')
dfEmean = dfE.mean().reset_index()
dfEmean['Legend'] = 'Average' # Hack to include line in legend, see color below.

fig = px.line(dfEmean, x='year', y='lifeExp', title='Life Expectancy in Europe', color='Legend', width=700, height=400)
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Life Expectancy (years)')
fig.update_layout(legend_title_text='Country')

# Fill between the yearly minimum and maximum.
# fill='tonexty' fills towards the previously added trace, so only 'Max' needs it.
dfEmin = dfE.min().reset_index()
dfEmax = dfE.max().reset_index()
fig.add_scatter(x=dfEmin['year'], y=dfEmin['lifeExp'], name='Min')
fig.add_scatter(x=dfEmax['year'], y=dfEmax['lifeExp'], name='Max', fill='tonexty')
dfNorway = df[df['country'] == 'Norway']
fig.add_scatter(x=dfNorway['year'], y=dfNorway['lifeExp'], name='Norway')
fig

Note

Note how .reset_index() promotes year from the group index back to an ordinary column.

Bar plot#

# Make a barplot of the life expectancy in Norway over time.
fig = px.bar(df[df['country'] == 'Norway'], x='year', y='lifeExp', title='Life Expectancy in Norway', width=700, height=400)
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Life Expectancy (years)')
fig

# Make a barplot with both Norway and Sweden in the same plot. Let the countries be side by side for each year.
fig = px.bar(df[df['country'].isin(['Norway', 'Sweden'])], x='year', y='lifeExp', color='country', barmode='group', width=700, height=400)
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Life Expectancy (years)')
fig.update_layout(legend_title_text='Country')
fig

Note

Remove barmode='group' to get stacked bars, which is the default for px.bar.

# Create a barplot with maximum life expectancy in Europe for each year.
# Overlay the life expectancy in Bulgaria over the plot with narrower bars using barmode='overlay'.
dfE = df[df['continent'] == 'Europe'][['year', 'lifeExp']].groupby('year')
dfEmax = dfE.max().reset_index()
dfEmax['Bulgaria'] = df[df['country'] == 'Bulgaria']['lifeExp'].reset_index()['lifeExp']
dfEmax.columns = ['year', 'Europe max', 'Bulgaria']
fig = px.bar(dfEmax, x='year', y=['Europe max', 'Bulgaria'], title='Life Expectancy in Europe', barmode='overlay', width=700, height=400)
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Life Expectancy (years)')
fig.update_layout(legend_title_text='Country')
fig

# Inspect the figure
print(fig)

Figure({
    'data': [{'alignmentgroup': 'True',
              'hovertemplate': 'variable=Europe max<br>year=%{x}<br>value=%{y}<extra></extra>',
              'legendgroup': 'Europe max',
              'marker': {'color': '#636efa', 'opacity': 0.5, 'pattern': {'shape': ''}},
              'name': 'Europe max',
              'offsetgroup': 'Europe max',
              'orientation': 'v',
              'showlegend': True,
              'textposition': 'auto',
              'type': 'bar',
              'x': array([1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, 2002, 2007]),
              'xaxis': 'x',
              'y': array([72.67 , 73.47 , 73.68 , 74.16 , 74.72 , 76.11 , 76.99 , 77.41 , 78.77 ,
                          79.39 , 80.62 , 81.757]),
              'yaxis': 'y'},
             {'alignmentgroup': 'True',
              'hovertemplate': 'variable=Bulgaria<br>year=%{x}<br>value=%{y}<extra></extra>',
              'legendgroup': 'Bulgaria',
              'marker': {'color': '#EF553B', 'opacity': 0.5, 'pattern': {'shape': ''}},
              'name': 'Bulgaria',
              'offsetgroup': 'Bulgaria',
              'orientation': 'v',
              'showlegend': True,
              'textposition': 'auto',
              'type': 'bar',
              'x': array([1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, 2002, 2007]),
              'xaxis': 'x',
              'y': array([59.6  , 66.61 , 69.51 , 70.42 , 70.9  , 70.81 , 71.08 , 71.34 , 71.19 ,
                          70.32 , 72.14 , 73.005]),
              'yaxis': 'y'}],
    'layout': {'barmode': 'overlay',
               'height': 400,
               'legend': {'title': {'text': 'Country'}, 'tracegroupgap': 0},
               'template': '...',
               'title': {'text': 'Life Expectancy in Europe'},
               'width': 700,
               'xaxis': {'anchor': 'y', 'domain': [0.0, 1.0], 'title': {'text': 'Year'}},
               'yaxis': {'anchor': 'x', 'domain': [0.0, 1.0], 'title': {'text': 'Life Expectancy (years)'}}}
})

# Adjust the width of the Bulgaria bars to 1.5 (in x-axis units, here years).
fig['data'][1]['width'] = 1.5
fig

Polar barplots#

The x-axis of a bar plot does not have to be straight.

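# Spread the 12 survey years evenly over the circle (0, 30, ..., 330 degrees).
angles = (dfEmax['year']-1952)/55*360*11/12
# Each bar spans 25 degrees, leaving a 5 degree gap to its neighbour.
width = [360/12-5]*12
r = dfEmax['Europe max']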

import plotly.graph_objects as go

fig = go.Figure(go.Barpolar(
    r=r,
    theta=angles,
    width=width,
    marker_color=dfEmax['Europe max'],
    marker_line_color="black",
    marker_line_width=2,
    opacity=0.8
))

fig.update_layout(
    template=None,
    polar = dict(
        radialaxis = dict(range=[0, 100], showticklabels=False, ticks=''),
        angularaxis = dict(showticklabels=False, ticks='')
    )
)

fig

# Change me to plotly express, please!

Scatter plot#

# Create a Plotly express scatter plot of the iris data
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color='species')
fig.update_xaxes(title='Sepal width')
fig.update_yaxes(title='Sepal length')
fig

# Inspect the scatter plot.
# Note three legendgroups and the markers. Many more options are available.
print(fig)

Figure({
    'data': [{'hovertemplate': 'species=setosa<br>sepal_width=%{x}<br>sepal_length=%{y}<extra></extra>',
              'legendgroup': 'setosa',
              'marker': {'color': '#636efa', 'symbol': 'circle'},
              'mode': 'markers',
              'name': 'setosa',
              'orientation': 'v',
              'showlegend': True,
              'type': 'scatter',
              'x': array([3.5, 3. , 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1, 3.7, 3.4, 3. , 3. ,
                          4. , 4.4, 3.9, 3.5, 3.8, 3.8, 3.4, 3.7, 3.6, 3.3, 3.4, 3. , 3.4, 3.5,
                          3.4, 3.2, 3.1, 3.4, 4.1, 4.2, 3.1, 3.2, 3.5, 3.1, 3. , 3.4, 3.5, 2.3,
                          3.2, 3.5, 3.8, 3. , 3.8, 3.2, 3.7, 3.3]),
              'xaxis': 'x',
              'y': array([5.1, 4.9, 4.7, 4.6, 5. , 5.4, 4.6, 5. , 4.4, 4.9, 5.4, 4.8, 4.8, 4.3,
                          5.8, 5.7, 5.4, 5.1, 5.7, 5.1, 5.4, 5.1, 4.6, 5.1, 4.8, 5. , 5. , 5.2,
                          5.2, 4.7, 4.8, 5.4, 5.2, 5.5, 4.9, 5. , 5.5, 4.9, 4.4, 5.1, 5. , 4.5,
                          4.4, 5. , 5.1, 4.8, 5.1, 4.6, 5.3, 5. ]),
              'yaxis': 'y'},
             {'hovertemplate': 'species=versicolor<br>sepal_width=%{x}<br>sepal_length=%{y}<extra></extra>',
              'legendgroup': 'versicolor',
              'marker': {'color': '#EF553B', 'symbol': 'circle'},
              'mode': 'markers',
              'name': 'versicolor',
              'orientation': 'v',
              'showlegend': True,
              'type': 'scatter',
              'x': array([3.2, 3.2, 3.1, 2.3, 2.8, 2.8, 3.3, 2.4, 2.9, 2.7, 2. , 3. , 2.2, 2.9,
                          2.9, 3.1, 3. , 2.7, 2.2, 2.5, 3.2, 2.8, 2.5, 2.8, 2.9, 3. , 2.8, 3. ,
                          2.9, 2.6, 2.4, 2.4, 2.7, 2.7, 3. , 3.4, 3.1, 2.3, 3. , 2.5, 2.6, 3. ,
                          2.6, 2.3, 2.7, 3. , 2.9, 2.9, 2.5, 2.8]),
              'xaxis': 'x',
              'y': array([7. , 6.4, 6.9, 5.5, 6.5, 5.7, 6.3, 4.9, 6.6, 5.2, 5. , 5.9, 6. , 6.1,
                          5.6, 6.7, 5.6, 5.8, 6.2, 5.6, 5.9, 6.1, 6.3, 6.1, 6.4, 6.6, 6.8, 6.7,
                          6. , 5.7, 5.5, 5.5, 5.8, 6. , 5.4, 6. , 6.7, 6.3, 5.6, 5.5, 5.5, 6.1,
                          5.8, 5. , 5.6, 5.7, 5.7, 6.2, 5.1, 5.7]),
              'yaxis': 'y'},
             {'hovertemplate': 'species=virginica<br>sepal_width=%{x}<br>sepal_length=%{y}<extra></extra>',
              'legendgroup': 'virginica',
              'marker': {'color': '#00cc96', 'symbol': 'circle'},
              'mode': 'markers',
              'name': 'virginica',
              'orientation': 'v',
              'showlegend': True,
              'type': 'scatter',
              'x': array([3.3, 2.7, 3. , 2.9, 3. , 3. , 2.5, 2.9, 2.5, 3.6, 3.2, 2.7, 3. , 2.5,
                          2.8, 3.2, 3. , 3.8, 2.6, 2.2, 3.2, 2.8, 2.8, 2.7, 3.3, 3.2, 2.8, 3. ,
                          2.8, 3. , 2.8, 3.8, 2.8, 2.8, 2.6, 3. , 3.4, 3.1, 3. , 3.1, 3.1, 3.1,
                          2.7, 3.2, 3.3, 3. , 2.5, 3. , 3.4, 3. ]),
              'xaxis': 'x',
              'y': array([6.3, 5.8, 7.1, 6.3, 6.5, 7.6, 4.9, 7.3, 6.7, 7.2, 6.5, 6.4, 6.8, 5.7,
                          5.8, 6.4, 6.5, 7.7, 7.7, 6. , 6.9, 5.6, 7.7, 6.3, 6.7, 7.2, 6.2, 6.1,
                          6.4, 7.2, 7.4, 7.9, 6.4, 6.3, 6.1, 7.7, 6.3, 6.4, 6. , 6.9, 6.7, 6.9,
                          5.8, 6.8, 6.7, 6.7, 6.3, 6.5, 6.2, 5.9]),
              'yaxis': 'y'}],
    'layout': {'legend': {'title': {'text': 'species'}, 'tracegroupgap': 0},
               'margin': {'t': 60},
               'template': '...',
               'xaxis': {'anchor': 'y', 'domain': [0.0, 1.0], 'title': {'text': 'Sepal width'}},
               'yaxis': {'anchor': 'x', 'domain': [0.0, 1.0], 'title': {'text': 'Sepal length'}}}
})

# Manipulate symbols: scale the marker size by petal width
fig = px.scatter(df, x="sepal_width", y="sepal_length", 
                 color='species', size="petal_width")
fig.update_xaxes(title='Sepal width')
fig.update_yaxes(title='Sepal length')
fig

Boxplots and violin plots#

# Make a boxplot of the life expectancy per country in Europe
df = px.data.gapminder()
dfE = df[df['continent'] == 'Europe']
fig = px.box(dfE, x='country', y='lifeExp', title='Life Expectancy in Europe', width=800, height=500)
fig.update_xaxes(title='Country')
fig.update_yaxes(title='Life Expectancy (years)')
fig

# Make a violinplot of the life expectancy per country in Europe 
# with the same style as the boxplot above.
fig = px.violin(dfE, x='country', y='lifeExp', title='Life Expectancy in Europe', width=800, height=500)
fig.update_xaxes(title='Country')
fig.update_yaxes(title='Life Expectancy (years)')
fig

Marginal plots#

Scatter plots support simple marginal plots, e.g., histograms and similar.

# Add a marginal box plot along the y-axis of the scatter plot.
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", 
                 color='species', size="petal_width", marginal_y='box')
fig.update_xaxes(title='Sepal width')
fig.update_yaxes(title='Sepal length')
fig

Exercise#

Test other marginal plot types and locations.

Heatmap#

# Make a correlation heatmap of the iris data
df = px.data.iris()
fig = px.imshow(df.corr(numeric_only=True))
fig

Tables#

One can plot tables with styling.

# Make a graph_objects table view for the iris data (Plotly express has no table function)
# https://plotly.com/python/table
import plotly.graph_objects as go
df = px.data.iris()
fig = go.Figure(data=[go.Table(
    header=dict(values=list(df.columns),
                fill_color='paleturquoise',
                align='left'),
    cells=dict(values=[df.sepal_length, df.sepal_width, df.petal_length, df.petal_width, df.species, df.species_id],
               fill_color='lavender',
               align='left'))
])

fig

Layouts#

Plotly express has no direct option for arranging several plots in one layout, except for facets (see below).
Instead, one needs to go to the low-level graph objects.

# Make a two by two subplot grid with two scatter plots and two pie charts, all four with random data
# https://plotly.com/python/subplots/
import plotly.graph_objects as go
import numpy as np
from plotly.subplots import make_subplots
np.random.seed(1)
# Initialize figure with subplots with type of plot in each cell
fig = make_subplots(rows=2, cols=2, 
                    specs=[[{"type": "xy"}, {"type": "xy"}], 
                           [{"type": "domain"}, {"type": "domain"}]])
fig.add_trace(go.Scatter(x=np.random.rand(100), y=np.random.rand(100), mode='markers'), row=1, col=1)
fig.add_trace(go.Scatter(x=np.random.rand(100), y=np.random.rand(100), mode='markers'), row=1, col=2)
fig.add_trace(go.Pie(values=np.random.rand(3)), row=2, col=1)
fig.add_trace(go.Pie(values=np.random.rand(3)), row=2, col=2)
fig.update_layout(height=600, width=800, title_text="Two by two subplots")
fig

Note

The plot type must be specified for the subplots, e.g., “xy”, “domain”.

Facet plots#

Facet plots are sets of plots having the same properties except for one categorical difference.
Examples can be scatter plots, line plots, histograms, etc. with one distinguishing feature.
Parameters such as facet_col, facet_row and facet_col_wrap control the layout.

# Tip dataset from Plotly
df = px.data.tips()
df.head()

	total_bill	tip	sex	smoker	day	time	size
0	16.99	1.01	Female	No	Sun	Dinner	2
1	10.34	1.66	Male	No	Sun	Dinner	3
2	21.01	3.50	Male	No	Sun	Dinner	3
3	23.68	3.31	Male	No	Sun	Dinner	2
4	24.59	3.61	Female	No	Sun	Dinner	4

# Scatter plot with color and facet
# https://plotly.com/python/facet-plots/
fig = px.scatter(df, x="total_bill", y="tip", color='sex', facet_col="day")
fig.update_xaxes(matches=None)
fig

Sunburst plot#

Hierarchical data, e.g., pivoted data, can be displayed as sunbursts.
These are pie charts with concentric circles marking hierarchical relationships.
Interactivity is useful here: clicking a sector zooms in on that branch of the hierarchy.

Note

As for ordinary pie charts, it is very hard to judge the relative sizes of sectors in sunburst plots.

# Sunburst plot
df = px.data.tips()
fig = px.sunburst(df, path=['day', 'time', 'sex'], values='total_bill')
fig

# Read the athlete_events.csv file
import pandas as pd
athletes = pd.read_csv('../../data/athlete_events.csv')
winter = athletes.loc[athletes['Season'] == 'Winter',:]
winter2000 = winter.loc[winter['Year'] >= 2000,:]

# Pivoting step on the winter2000 data
w2sy = winter2000.pivot_table(index='Sport', columns='Year', values='Height', aggfunc='count')

# Remove rows that only contain NaN values
w2sy = w2sy.dropna(how='all')

w2syu = w2sy.unstack().reset_index()
w2syu.columns = ['Year', 'Sport', 'Athletes']
w2syu.head()

	Year	Sport	Athletes
0	2002	Alpine Skiing	551
1	2002	Biathlon	564
2	2002	Bobsleigh	238
3	2002	Cross Country Skiing	766
4	2002	Curling	96

fig = px.sunburst(w2syu, path=['Year', 'Sport'], values='Athletes')
# Add header: "Athletes per sport in winter olympics"
fig.update_layout(title_text='Athletes per sport in winter olympics')
fig
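
The pivot-then-unstack step can be checked on a small made-up table. The rows below are hypothetical and are not taken from athlete_events.csv; the groupby variant is shown as an alternative route to the same long format.

```python
import pandas as pd

# Hypothetical miniature of the athlete table (not the real data).
toy = pd.DataFrame({
    'Sport':  ['Curling', 'Curling', 'Biathlon', 'Biathlon', 'Biathlon'],
    'Year':   [2002, 2006, 2002, 2002, 2006],
    'Height': [170.0, 180.0, 175.0, 182.0, 178.0],
})

# Wide format: one row per sport, one column per year.
wide = toy.pivot_table(index='Sport', columns='Year', values='Height', aggfunc='count')

# Long format: one row per (Year, Sport) pair, as px.sunburst expects.
long_df = wide.unstack().reset_index()
long_df.columns = ['Year', 'Sport', 'Athletes']

# The same counts obtained directly with groupby, skipping the wide table.
direct = toy.groupby(['Year', 'Sport'])['Height'].count().reset_index(name='Athletes')
```

The two routes differ when a sport is missing in some year: the pivot gives NaN for that pair, while groupby simply omits the row.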

Parallel coordinates#

Parallel coordinates show multiple features in a parallel coordinate system.
Each sample is a line marking its value on each feature axis.
Colours come from classes or from a continuous feature.
Interactivity includes selecting a range on a coordinate axis and rearranging the axes.

# Use Plotly parallel coordinates to visualize the Iris data
# https://plot.ly/python/parallel-coordinates-plot/
df = px.data.iris()
fig = px.parallel_coordinates(df, color="species_id", labels={"species_id": "Species",
                "sepal_width": "Sepal Width", "sepal_length": "Sepal Length",
                "petal_width": "Petal Width", "petal_length": "Petal Length", },
                color_continuous_scale=px.colors.diverging.Tealrose, color_continuous_midpoint=2)
fig

Exercise#

Adjust the above code to include a slider for opacity.
