Visualise the Flora Danica Cleaned Metadata

Script Summary

This notebook visualizes the cleaned Flora Danica metadata to explore patterns in botanical documentation, author contributions, and taxonomic distribution.

Steps:

Load the cleaned Flora Danica dataset from the CSV file created in the previous notebook
Visualize author contributions by counting and plotting the number of published plates per author
Visualize taxonomic groups by counting and plotting the distribution of plant groups (filtered to show groups with more than 15 plates)
Create interactive visualizations using Plotly Express for exploration

Outputs: Interactive bar charts showing author contributions and taxonomic group distributions, enabling exploration of patterns in the Flora Danica collection.

Dataset: To download the whole Flora Danica dataset, go to the Library’s Open Access Repository. The cleaned Flora Danica dataset created in the previous notebook is also available for download.

Digital Collection: To browse the images of the Flora Danica collection online, go to the Digital collections.

Import Libraries

Import the required libraries for data manipulation and visualization.

Before you start:
This script uses the following libraries, which are not part of the Python standard library. They can be installed from the notebook using pip install. Create a new cell and run the following:

!pip install pandas plotly

Once the libraries are installed, continue with the script.

# Import libraries
import pandas as pd
import plotly.express as px

Load the Dataset

The cleaned dataset is loaded from the CSV file created in the previous notebook, Clean the Flora Danica dataset.

# Load the cleaned dataset (forward slashes work on Windows, macOS and Linux)
subset_df = pd.read_csv('mekuni_flora_danica_data/flora_danica_tidy_format.csv')
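
Before plotting, it can help to confirm that the columns used later in this notebook are present. The following is a minimal sketch and not part of the original notebook: `missing_columns` is a hypothetical helper, and the small in-memory frame stands in for `subset_df` with placeholder values so that the example runs without the CSV file.

```python
import pandas as pd

# Columns that the plots in this notebook rely on
REQUIRED_COLUMNS = ["author_st", "taxonomic_group_st"]


def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from df (hypothetical helper)."""
    return [col for col in required if col not in df.columns]


# Stand-in for subset_df; the values are placeholders, not real metadata
sample_df = pd.DataFrame({
    "author_st": ["Author A", "Author B"],
    "taxonomic_group_st": ["Group X", "Group Y"],
})

missing = missing_columns(sample_df)
print("Missing columns:", missing)
```

With the real data, calling `missing_columns(subset_df)` should return an empty list if the file was produced by the previous notebook as expected.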

Visualize Authors

Count the number of published plates per author and create a bar chart to visualize the distribution. Plotly enables the creation of interactive plots that can be explored by hovering over data points and zooming.

# Count plates per author (columns: 'author_st', 'count')
author_data_in = (subset_df['author_st'].value_counts()
                  .rename_axis('author_st').reset_index(name='count'))

# Plot the counts as an interactive bar chart
fig = px.bar(author_data_in, x='author_st', y='count',
             color_discrete_sequence=["#13dbb7"],
             title='Authors and Number of Published Plates')


# Update layout for better readability
fig.update_layout(
    xaxis_title='Authors',
    yaxis_title='Count',
    xaxis_tickangle=-45,  # Rotate x-axis labels for better readability
    yaxis=dict(showgrid=True, gridcolor="#45d6d3", gridwidth=0.5),  # Add gridlines
    plot_bgcolor='white'  # Set the background color to white
)

fig.show()
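
To see what the counting step produces before it is passed to Plotly, the sketch below runs it on a few placeholder rows. It is an illustration only: the author names are made up, and the `rename_axis` / `reset_index(name=...)` form is used because it yields the `author_st` and `count` columns regardless of the installed pandas version.

```python
import pandas as pd

# Placeholder values standing in for the real author column
toy_authors = pd.DataFrame(
    {"author_st": ["Author A", "Author B", "Author A", "Author A"]}
)

# Count occurrences and turn the result into a two-column table
toy_counts = (
    toy_authors["author_st"]
    .value_counts()                # counts per author, most frequent first
    .rename_axis("author_st")      # name the index so it becomes a column
    .reset_index(name="count")     # move the index into a column, name the counts
)

print(toy_counts)
```

The resulting table has one row per author, which is the shape `px.bar` expects when given `x='author_st'` and `y='count'`.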

Visualize Taxonomic Groups

Count the number of plates per taxonomic group, filter to show only groups with more than 15 plates, and create a bar chart to visualize the distribution.

# Count plates per taxonomic group and keep groups with more than 15 plates
taxonomy_data_in = (subset_df['taxonomic_group_st'].value_counts()
                    .rename_axis('taxonomic_group_st').reset_index(name='count'))
taxonomy_data_in = taxonomy_data_in.query('count > 15')

# Use Plotly to create a chart
fig = px.bar(taxonomy_data_in, x='taxonomic_group_st', y='count',
             color_discrete_sequence=["#ea7600"],
             title='Plant Groups')


# Customize layout to improve readability
fig.update_layout(
    xaxis_title='Taxonomic Group',
    yaxis_title='Count',
    xaxis_tickangle=-45,  # Rotate x-axis labels
    yaxis=dict(showgrid=True, gridcolor="#45d6d3", gridwidth=0.5),  # Add gridlines
    plot_bgcolor='white'  # Set background color to white
)

fig.show()
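
The cut-off of 15 plates is a choice that may need adjusting when exploring the data. As a rough sketch, the count-and-filter step could be wrapped in a function with the threshold as a parameter. The function name `count_above` and the placeholder data are assumptions made for this illustration.

```python
import pandas as pd


def count_above(df, column, min_count=15):
    """Count the values in `column` and keep those occurring more than `min_count` times."""
    counts = (
        df[column]
        .value_counts()
        .rename_axis(column)
        .reset_index(name="count")
    )
    # Strictly greater than, matching the 'count > 15' filter used above
    return counts[counts["count"] > min_count].reset_index(drop=True)


# Placeholder data: "Group X" appears three times, "Group Y" once
toy_groups = pd.DataFrame({"taxonomic_group_st": ["Group X"] * 3 + ["Group Y"]})

frequent = count_above(toy_groups, "taxonomic_group_st", min_count=2)
print(frequent)
```

With the real data, `count_above(subset_df, 'taxonomic_group_st')` would reproduce the filter used in this section, and a different `min_count` shows more or fewer groups.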

Other studies

The Flora Danica metadata visualizations can be extended with several complementary analyses:

Create time-series visualizations to show how plant documentation evolved over different publication periods and identify trends in botanical knowledge.
Develop network visualizations to explore relationships between authors, taxonomic groups, and publication issues.
Create geographic visualizations if location data becomes available to map the distribution of documented plants.
Build interactive dashboards combining multiple visualizations to enable comprehensive exploration of the collection.
Analyze the relationship between Latin and Danish nomenclature through comparative visualizations.
Create heatmaps showing the distribution of plant families across different authors or time periods.
Develop treemap visualizations to show hierarchical relationships in taxonomic classification.
Explore publication patterns by visualizing the distribution of plates across different issues and volumes.
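
As one possible starting point for the heatmap idea above, the two columns already used in this notebook can be cross-tabulated and drawn with Plotly Express. This is a sketch under stated assumptions: it uses taxonomic groups rather than plant families, the function names are hypothetical, and the rows are placeholders standing in for `subset_df`.

```python
import pandas as pd


def author_group_table(df):
    """Cross-tabulate plates by author (rows) and taxonomic group (columns)."""
    return pd.crosstab(df["author_st"], df["taxonomic_group_st"])


def plot_author_group_heatmap(table):
    """Draw the cross-tabulation as an interactive heatmap."""
    import plotly.express as px  # imported here so the table can be built without Plotly

    return px.imshow(
        table.values,
        x=table.columns.tolist(),
        y=table.index.tolist(),
        labels=dict(x="Taxonomic Group", y="Author", color="Plates"),
        title="Plates per Author and Taxonomic Group",
    )


# Placeholder rows standing in for subset_df
toy_plates = pd.DataFrame({
    "author_st": ["Author A", "Author A", "Author B"],
    "taxonomic_group_st": ["Group X", "Group Y", "Group X"],
})

table = author_group_table(toy_plates)
print(table)
```

In a notebook, `plot_author_group_heatmap(author_group_table(subset_df)).show()` would display the chart. With many authors or groups, filtering to the most frequent ones first is likely to keep it readable.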