# Plotly graphics
- One of the major players in interactive graphs is [Plotly](https://plotly.com/python/).
- Some alternatives are Bokeh and Altair.
- The Python interface comes in two main flavours:
    - _graph_objects_: low-level graphics handling
    - _plotly.express_: high-level graphics handling
- In addition, _plotly_ is integrated into the _dash_ framework, which has its own dialect.
- Figures behave like nested dictionaries, which we will leverage.

In [None]:
# The following renders plotly graphs in Jupyter Notebook, Jupyter Lab and VS Code formats
import plotly.io as pio
pio.renderers.default = "notebook+plotly_mimetype"

## Plotting with AI assistance
- Many plot commands can be obtained by describing plots to AIs.
- AIs can also translate from one plotting framework to another.
- Sketching a set of plots and adding sufficient descriptions may result in usable code.

## Basic plotting

In [None]:
# Gapminder dataset of health and wealth stats for different countries
import plotly.express as px
df = px.data.gapminder()
df.head()

### Line plot

In [None]:
# Create a line plot of life expectancy over time for Norway.
# Let the figure be 400 pixels high and 700 pixels wide.
# Set the title to 'Life Expectancy in Norway'.
# Set the x-axis label to 'Year'.
# Set the y-axis label to 'Life Expectancy (years)'.
fig = px.line(df[df['country'] == 'Norway'], x='year', y='lifeExp', title='Life Expectancy in Norway', width=700, height=400)
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Life Expectancy (years)')
fig

In [None]:
# Create a plot with one line for Norway and one line for Sweden in the same style as the plot above.
# Let the legend title be 'Country'.
fig = px.line(df[df['country'].isin(['Norway', 'Sweden'])], x='year', y='lifeExp', color='country', width=700, height=400)
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Life Expectancy (years)')
fig.update_layout(legend_title_text='Country')
fig

In [None]:
# The dictionary defining the figure
print(fig)

In [None]:
print(fig['data'][0]['line']['color'])

### Directly editing the dictionary

In [None]:
fig['data'][0]['line']['color'] = "#000000"

In [None]:
fig

### Shaded areas
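- The `fill` parameter can fill towards a neighbouring trace (`'tonexty'`, which despite its name fills to the trace added just before), to zero (`'tozeroy'`), or to itself (`'toself'`) if the series reverses.
- The latter is most convenient for single colour background shading.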

In [None]:
# Plot the mean life expectancy in Europe over time. Shade the area between minimum and maximum life expectancy in Europe over time.
# Overlay Norway's life expectancy over the plot.
# https://plotly.com/python/continuous-error-bars/
dfE = df[df['continent'] == 'Europe'][['year', 'lifeExp']].groupby('year')
dfEmean = dfE.mean().reset_index()
dfEmean['Legend'] = 'Average' # Hack to include line in legend, see color below.

fig = px.line(dfEmean, x='year', y='lifeExp', title='Life Expectancy in Europe', color='Legend', width=700, height=400)
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Life Expectancy (years)')
fig.update_layout(legend_title_text='Country')

# Shade between the minimum and the maximum. 'tonexty' fills to the trace added just before,
# so only the Max trace gets a fill.
dfElo = dfE.min().reset_index()
dfEhi = dfE.max().reset_index()
fig.add_scatter(x=dfElo['year'], y=dfElo['lifeExp'], name='Min')
fig.add_scatter(x=dfEhi['year'], y=dfEhi['lifeExp'], name='Max', fill='tonexty')
dfN = df[df['country'] == 'Norway']
fig.add_scatter(x=dfN['year'], y=dfN['lifeExp'], name='Norway')
fig

```{note}
Note how `.reset_index()` turns the grouping key `year` from the index back into an ordinary column.
```

### Bar plot

In [None]:
# Make a barplot of the life expectancy in Norway over time.
fig = px.bar(df[df['country'] == 'Norway'], x='year', y='lifeExp', title='Life Expectancy in Norway', width=700, height=400)
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Life Expectancy (years)')
fig

In [None]:
# Make a barplot with both Norway and Sweden in the same plot. Let the countries be side by side for each year.
fig = px.bar(df[df['country'].isin(['Norway', 'Sweden'])], x='year', y='lifeExp', color='country', barmode='group', width=700, height=400)
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Life Expectancy (years)')
fig.update_layout(legend_title_text='Country')
fig

```{note}
Remove "barmode" for stacking.
```

In [None]:
# Create a barplot with maximum life expectancy in Europe for each year.
# Overlay the life expectancy in Bulgaria over the plot with narrower bars using barmode='overlay'.
dfE = df[df['continent'] == 'Europe'][['year', 'lifeExp']].groupby('year')
dfEmax = dfE.max().reset_index()
dfEmax['Bulgaria'] = df[df['country'] == 'Bulgaria']['lifeExp'].reset_index()['lifeExp']
dfEmax.columns = ['year', 'Europe max', 'Bulgaria']
fig = px.bar(dfEmax, x='year', y=['Europe max', 'Bulgaria'], title='Life Expectancy in Europe', barmode='overlay', width=700, height=400)
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Life Expectancy (years)')
fig.update_layout(legend_title_text='Country')
fig

In [None]:
# Inspect the figure
print(fig)

In [None]:
# Adjust the width of the Bulgaria bars to 1.5 (in x-axis units, here years).
fig['data'][1]['width'] = 1.5
fig

### Polar barplots
- The x-axis in a barplot does not have to be straight.

In [None]:
angles = (dfEmax['year']-1952)/55*360*11/12
width = [360/12-5]*12
r = dfEmax['Europe max']

In [None]:
import plotly.graph_objects as go

fig = go.Figure(go.Barpolar(
    r=r,
    theta=angles,
    width=width,
    marker_color=dfEmax['Europe max'],
    marker_line_color="black",
    marker_line_width=2,
    opacity=0.8
))

fig.update_layout(
    template=None,
    polar = dict(
        radialaxis = dict(range=[0, 100], showticklabels=False, ticks=''),
        angularaxis = dict(showticklabels=False, ticks='')
    )
)

fig

# Change me to plotly express, please!

### Scatter plot

In [None]:
# Create a Plotly express scatter plot of the iris data
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color='species')
fig.update_xaxes(title='Sepal width')
fig.update_yaxes(title='Sepal length')
fig

In [None]:
# Inspect the scatter plot.
# Note the three legend groups (one per species) and the marker settings. Many more options are available.
print(fig)

In [None]:
# Manipulate symbols: scale the marker size by petal width
fig = px.scatter(df, x="sepal_width", y="sepal_length", 
                 color='species', size="petal_width")
fig.update_xaxes(title='Sepal width')
fig.update_yaxes(title='Sepal length')
fig

## Boxplots and violin plots

In [None]:
# Make a boxplot of the life expectancy per country in Europe
df = px.data.gapminder()
dfE = df[df['continent'] == 'Europe']
fig = px.box(dfE, x='country', y='lifeExp', title='Life Expectancy in Europe', width=800, height=500)
fig.update_xaxes(title='Country')
fig.update_yaxes(title='Life Expectancy (years)')
fig

In [None]:
# Make a violin plot of the life expectancy per country in Europe
# with the same style as the boxplot above.
fig = px.violin(dfE, x='country', y='lifeExp', title='Life Expectancy in Europe', width=800, height=500)
fig.update_xaxes(title='Country')
fig.update_yaxes(title='Life Expectancy (years)')
fig

### Marginal plots
- Scatter plots support simple marginal plots, e.g., histograms and similar.

In [None]:
# Add a marginal box plot along the y-axis to the scatter plot.
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", 
                 color='species', size="petal_width", marginal_y='box')
fig.update_xaxes(title='Sepal width')
fig.update_yaxes(title='Sepal length')
fig

## Exercise
- Test other marginal plot types and locations.

## Heatmap

In [None]:
# Make a correlation heatmap of the iris data
df = px.data.iris()
fig = px.imshow(df.corr(numeric_only=True))
fig

## Tables
- One can plot tables with styling.

In [None]:
# Make a Plotly table view for the iris data (tables use graph_objects; Plotly Express has no table function)
# https://plotly.com/python/table
import plotly.graph_objects as go
df = px.data.iris()
fig = go.Figure(data=[go.Table(
    header=dict(values=list(df.columns),
                fill_color='paleturquoise',
                align='left'),
    cells=dict(values=[df.sepal_length, df.sepal_width, df.petal_length, df.petal_width, df.species, df.species_id],
               fill_color='lavender',
               align='left'))
])

fig

## Layouts
- Plotly Express has no direct layout option, except for facets (see below).
- Instead, one needs to use the low-level graph objects.

In [None]:
# Make a two-by-two subplot figure with two scatter plots and two pie charts, all four with random data
# https://plotly.com/python/subplots/
import plotly.graph_objects as go
import numpy as np
from plotly.subplots import make_subplots
np.random.seed(1)
# Initialize figure with subplots with type of plot in each cell
fig = make_subplots(rows=2, cols=2, 
                    specs=[[{"type": "xy"}, {"type": "xy"}], 
                           [{"type": "domain"}, {"type": "domain"}]])
fig.add_trace(go.Scatter(x=np.random.rand(100), y=np.random.rand(100), mode='markers'), row=1, col=1)
fig.add_trace(go.Scatter(x=np.random.rand(100), y=np.random.rand(100), mode='markers'), row=1, col=2)
fig.add_trace(go.Pie(values=np.random.rand(3)), row=2, col=1)
fig.add_trace(go.Pie(values=np.random.rand(3)), row=2, col=2)
fig.update_layout(height=600, width=800, title_text="Two by two subplots")
fig

```{note}
The plot type must be specified for the subplots, e.g., "xy", "domain".
```

## Facet plots
- Facet plots are sets of plots having the same properties except for one categorical difference.
- Examples can be scatter plots, line plots, histograms, etc. with one distinguishing feature.
- Parameters for layout specifications are available.

In [None]:
# Tip dataset from Plotly
df = px.data.tips()
df.head()

In [None]:
# Scatter plot with color and facet
# https://plotly.com/python/facet-plots/
fig = px.scatter(df, x="total_bill", y="tip", color='sex', facet_col="day")
fig.update_xaxes(matches=None)
fig

## Sunburst plot
- Hierarchical data, e.g., pivoted data, can be displayed as sunbursts.
- These are pie charts with concentric circles marking hierarchical relationships.
- Interactivity is particularly useful here: clicking a sector zooms in on that branch of the hierarchy.

```{note}
As with ordinary pie charts, it is very hard to judge the relative sizes of sectors in sunburst plots.
```

In [None]:
# Sunburst plot
df = px.data.tips()
fig = px.sunburst(df, path=['day', 'time', 'sex'], values='total_bill')
fig

In [None]:
# Read the athlete_events.csv file
import pandas as pd
athletes = pd.read_csv('../../data/athlete_events.csv')
winter = athletes.loc[athletes['Season'] == 'Winter',:]
winter2000 = winter.loc[winter['Year'] >= 2000,:]

# Pivoting step on the winter2000 data
w2sy = winter2000.pivot_table(index='Sport', columns='Year', values='Height', aggfunc='count')

# Remove rows that only contain NaN values
w2sy = w2sy.dropna(how='all')

w2syu = w2sy.unstack().reset_index()
w2syu.columns = ['Year', 'Sport', 'Athletes']
w2syu.head()
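
The pivot and unstack steps can be followed on a made-up miniature table (the rows below are hypothetical, not taken from `athlete_events.csv`). Note that `aggfunc='count'` counts non-missing `Height` values, and that combinations without any rows end up as `NaN`.

```python
import pandas as pd

# Hypothetical miniature stand-in for the athlete data
toy = pd.DataFrame({
    'Sport': ['Biathlon', 'Biathlon', 'Curling', 'Curling', 'Curling', 'Luge'],
    'Year': [2002, 2006, 2002, 2002, 2006, 2006],
    'Height': [180.0, 175.0, 170.0, None, 168.0, 182.0],
})

# Wide table: one row per sport, one column per year
wide = toy.pivot_table(index='Sport', columns='Year', values='Height', aggfunc='count')

# Long table: one row per (year, sport) pair, as needed for the sunburst path
long = wide.unstack().reset_index()
long.columns = ['Year', 'Sport', 'Athletes']
print(long)
```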

In [None]:
fig = px.sunburst(w2syu, path=['Year', 'Sport'], values='Athletes')
# Add header: "Athletes per sport in the Winter Olympics"
fig.update_layout(title_text='Athletes per sport in the Winter Olympics')
fig

## Parallel coordinates
- Multiple features in a parallel coordinate system.
- Each sample is a line marking values in each feature.
- Colours from classes or continuous feature.
- Interactivity includes marking part of coordinate axis and rearranging coordinate axes.

In [None]:
# Use Plotly parallel coordinates to visualize the Iris data
# https://plot.ly/python/parallel-coordinates-plot/
df = px.data.iris()
fig = px.parallel_coordinates(df, color="species_id", labels={"species_id": "Species",
                "sepal_width": "Sepal Width", "sepal_length": "Sepal Length",
                "petal_width": "Petal Width", "petal_length": "Petal Length", },
                color_continuous_scale=px.colors.diverging.Tealrose, color_continuous_midpoint=2)
fig

## Exercise
- Adjust the above code to include a slider for opacity.

```{seealso} Resources
:class: tip
- [Plotly overview](https://plotly.com/python/)
- [Plotly API reference](https://plotly.com/python-api-reference/index.html)
```

In [None]:
# Dummy cell to ensure Plotly graphics are shown
import plotly.graph_objects as go
f = go.FigureWidget([go.Scatter(x=[1,1], y=[1,1], mode='markers')])