Note

This page was generated from a Jupyter notebook that can be accessed from GitHub.

Calibrating MESMER on multiple scenarios#

This tutorial shows how to calibrate the parameters for MESMER on an example dataset of coarsely regridded ESM output for multiple climate change scenarios. We calibrate the parameters using three scenarios: historical, a low-emission scenario (SSP1-2.6), and a high-emission scenario (SSP5-8.5), where SSP5-8.5 includes two ensemble members. You can find the basics of the MESMER approach in Beusch et al. (2020) and the multi-scenario approach in Beusch et al. (2022). Training MESMER consists of four steps:

global trend: compute the global temperature trend, including the volcanic influence on historical trends
global variability: estimate the parameters needed to generate global variability
local trend: estimate parameters to translate global mean temperature (including global variability) into local temperature
local variability: estimate parameters needed to generate local variability

This example can be extended to more scenarios, ensemble members and higher resolution data. See also the mesmer calibration test in tests/integration/.

import pathlib

import cartopy.crs as ccrs
import filefisher
import matplotlib.pyplot as plt
import xarray as xr

import mesmer

Load data#

MESMER expects a specific data format. Data from each scenario should be a node (or group) on an xr.DataTree (more on this below) e.g.:

<xarray.DataTree>
Group: /
├── Group: /historical
|    ...
├── Group: /ssp126
|    ...

Each scenario is an xr.Dataset with 4 dimensions: member, time, lat, lon. Below we show one way to load data such that it conforms to the desired format. We load data from the cmip6-ng (“new generation”) repository. This data has undergone a small reformatting from the original CMIP6 archive. For the sake of computational speed we also load data which has been regridded to a coarse resolution. You can adapt the loading step to the data format you are most used to, as long as the final output has the desired format.

MESMER is Earth System Model specific, aiming to reproduce the behaviour of one ESM. Here we train on the CMIP6 output of the model IPSL-CM6A-LR.

model = "IPSL-CM6A-LR"

We use the library filefisher to search all files in the cmip6-ng archive for the model and scenarios we want to use. Filefisher can search through paths for given file patterns. It returns all paths matching the pattern such that you can load the files in the next step.

Here, we want to find all files that contain annual near-surface air temperature ("tas") for the chosen model and the future scenarios ssp126 and ssp585. Next, we search for the historical data that match the members found for the two future scenarios.

# mesmer provides example data under "./data/cmip6-ng"
cmip_data_path = mesmer.example_data.cmip6_ng_path(relative=True)

CMIP_FILEFINDER = filefisher.FileFinder(
    path_pattern=cmip_data_path / "{variable}/{time_res}/{resolution}",
    file_pattern="{variable}_{time_res}_{model}_{scenario}_{member}_{resolution}.nc",
)
CMIP_FILEFINDER

<FileFinder>
path_pattern: '../data/cmip6-ng/{variable}/{time_res}/{resolution}/'
file_pattern: '{variable}_{time_res}_{model}_{scenario}_{member}_{resolution}.nc'

keys: 'member', 'model', 'resolution', 'scenario', 'time_res', 'variable'
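The idea behind such patterns can be illustrated without filefisher. The following is a minimal sketch using only the standard library: `pattern_to_regex` is a hypothetical helper and not part of filefisher's API, and it assumes that no key value contains an underscore or a slash.

```python
import re


def pattern_to_regex(pattern):
    """Turn a "{key}" pattern into a compiled regex with named groups (hypothetical helper)."""
    # re.split with a capturing group alternates literal text and key names
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)  # literal text between placeholders
        else:
            # assumption: values never contain "_" or "/"
            regex += f"(?P<{part}>[^_/]+)"
    return re.compile(regex)


FILE_PATTERN = "{variable}_{time_res}_{model}_{scenario}_{member}_{resolution}.nc"

names = [
    "tas_ann_IPSL-CM6A-LR_ssp126_r1i1p1f1_g025.nc",
    "tas_ann_IPSL-CM6A-LR_ssp585_r2i1p1f1_g025.nc",
    "README.txt",  # does not match the pattern and is skipped
]

regex = pattern_to_regex(FILE_PATTERN)

# parse the metadata of every matching file name
parsed = {}
for name in names:
    match = regex.fullmatch(name)
    if match is not None:
        parsed[name] = match.groupdict()

# keep only the files of one scenario
ssp585_files = [name for name, meta in parsed.items() if meta["scenario"] == "ssp585"]
print(ssp585_files)
```

The parsed dictionary plays a role similar to the metadata table returned by the search: each file name maps to the values of its keys.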

Search data for ssp126 and ssp585 - we find one and two ensemble members, respectively:

scenarios = ["ssp126", "ssp585"]

keys = {"variable": "tas", "model": model, "resolution": "g025", "time_res": "ann"}

fc_scens = CMIP_FILEFINDER.find_files(scenario=scenarios, keys=keys)
fc_scens.df

	variable	time_res	resolution	model	scenario	member
path
../data/cmip6-ng/tas/ann/g025/tas_ann_IPSL-CM6A-LR_ssp126_r1i1p1f1_g025.nc	tas	ann	g025	IPSL-CM6A-LR	ssp126	r1i1p1f1
../data/cmip6-ng/tas/ann/g025/tas_ann_IPSL-CM6A-LR_ssp585_r1i1p1f1_g025.nc	tas	ann	g025	IPSL-CM6A-LR	ssp585	r1i1p1f1
../data/cmip6-ng/tas/ann/g025/tas_ann_IPSL-CM6A-LR_ssp585_r2i1p1f1_g025.nc	tas	ann	g025	IPSL-CM6A-LR	ssp585	r2i1p1f1

We also need to find the same ensemble members in the historical data, such that we end up with five files we need to load:

# get the historical members that are also in the future scenarios, but only once
members = fc_scens.df.member.unique()

fc_hist = CMIP_FILEFINDER.find_files(scenario="historical", member=members, keys=keys)

fc_all = fc_hist.concat(fc_scens)
fc_all.df

	variable	time_res	resolution	model	scenario	member
path
../data/cmip6-ng/tas/ann/g025/tas_ann_IPSL-CM6A-LR_historical_r1i1p1f1_g025.nc	tas	ann	g025	IPSL-CM6A-LR	historical	r1i1p1f1
../data/cmip6-ng/tas/ann/g025/tas_ann_IPSL-CM6A-LR_historical_r2i1p1f1_g025.nc	tas	ann	g025	IPSL-CM6A-LR	historical	r2i1p1f1
../data/cmip6-ng/tas/ann/g025/tas_ann_IPSL-CM6A-LR_ssp126_r1i1p1f1_g025.nc	tas	ann	g025	IPSL-CM6A-LR	ssp126	r1i1p1f1
../data/cmip6-ng/tas/ann/g025/tas_ann_IPSL-CM6A-LR_ssp585_r1i1p1f1_g025.nc	tas	ann	g025	IPSL-CM6A-LR	ssp585	r1i1p1f1
../data/cmip6-ng/tas/ann/g025/tas_ann_IPSL-CM6A-LR_ssp585_r2i1p1f1_g025.nc	tas	ann	g025	IPSL-CM6A-LR	ssp585	r2i1p1f1

Now we load all the files we found into a DataTree, a data structure provided by xarray. It is a container for xarray Dataset objects that cannot be aligned. This is useful for us since the historical and future data have different time coordinates. Moreover, the scenarios may have different numbers of members (e.g., SSP1-2.6 has only one here). Thus, we store the data of each scenario in a Dataset with all its ensemble members along a member dimension. Then we store each scenario dataset as a node of one DataTree. The DataTree allows us to perform computations on each of the scenarios separately.
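As a plain-Python analogy of why a tree of per-scenario datasets helps, the sketch below uses a dictionary of NumPy arrays instead of a DataTree. The data are random, and the lengths (165 historical years for 1850-2014, 86 future years for 2015-2100) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical global mean series with shape (member, time), one entry per scenario
data = {
    "historical": rng.normal(size=(2, 165)),  # two members, 1850-2014
    "ssp126": rng.normal(size=(1, 86)),  # one member, 2015-2100
    "ssp585": rng.normal(size=(2, 86)),  # two members, 2015-2100
}

# the shapes differ, so the scenarios cannot be stacked into one regular array
shapes = {scen: arr.shape for scen, arr in data.items()}
print(shapes)

# the same operation can still be applied to each scenario on its own,
# here the mean over the member axis
ens_mean = {scen: arr.mean(axis=0) for scen, arr in data.items()}
print({scen: arr.shape for scen, arr in ens_mean.items()})
```

A DataTree offers the same per-node logic while keeping the labelled coordinates of each Dataset.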

We define a helper function to load the data from the cmip6_ng example data repository:

def load_data(filecontainer):

    out = xr.DataTree()

    scenarios = filecontainer.df.scenario.unique().tolist()

    # load data for each scenario
    for scen in scenarios:
        files = filecontainer.search(scenario=scen)

        # load all members for a scenario
        members = []
        for fN, meta in files.items():
            time_coder = xr.coders.CFDatetimeCoder(use_cftime=True)
            ds = xr.open_dataset(fN, decode_times=time_coder)
            # drop unnecessary variables
            ds = ds.drop_vars(["height", "time_bnds", "file_qf"], errors="ignore")
            # assign member-ID as coordinate
            ds = ds.assign_coords({"member": meta["member"]})
            members.append(ds)

        # create a Dataset that holds each member along the member dimension
        scen_data = xr.concat(members, dim="member")
        # put the scenario dataset into the DataTree
        out[scen] = xr.DataTree(scen_data)

    return out

This results in the data format discussed above.

We will need some configuration parameters in the following:

THRESHOLD_LAND: land fraction above which a grid point is treated as a land grid point.
REFERENCE_PERIOD: we work with temperature anomalies relative to a reference period rather than with absolute temperature values.

THRESHOLD_LAND = 1 / 3
REFERENCE_PERIOD = slice("1850", "1900")

Calculate anomalies#

# calculate anomalies w.r.t. the reference period
tas_anom = mesmer.anomaly.calc_anomaly(dt, reference_period=REFERENCE_PERIOD)
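To make the anomaly step concrete, here is a minimal NumPy sketch on synthetic data. It is not mesmer's implementation. It assumes that the reference mean is computed separately for each member, and it includes both end years of the reference period, as label-based slicing in xarray does.

```python
import numpy as np

years = np.arange(1850, 2015)
rng = np.random.default_rng(0)

# synthetic absolute temperatures in K with shape (member, time)
trend = 0.005 * (years - 1850)
tas = 287.0 + trend + rng.normal(scale=0.1, size=(2, years.size))

# select the reference period 1850-1900 (both end years included)
in_ref = (years >= 1850) & (years <= 1900)

# assumption: one reference value per member (mean over the reference years)
ref_mean = tas[:, in_ref].mean(axis=1, keepdims=True)

# anomalies relative to the reference period
tas_anom = tas - ref_mean

print(tas_anom[:, in_ref].mean(axis=1))  # close to zero for every member
```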

“Smooth” and volcanic components of the global temperature#

The volcanic contributions to the global mean temperature trend of the historical period have to be removed to estimate the linear regression of global mean temperature to local temperature.

Calculate \(T_{t}^{glob,\,smooth}\) using a lowess smoother, with 50 time steps:

# mean over members before smoothing
tas_globmean_ensmean = tas_globmean.mean(dim="member")

n_steps = 50

tas_globmean_smoothed = mesmer.stats.lowess(
    tas_globmean_ensmean,
    dim="time",
    n_steps=n_steps,
    use_coords=False,
)

# plot historical
f, ax = plt.subplots()

h0, *_ = tas_globmean["historical"].tas.plot.line(ax=ax, x="time", color="grey", lw=1)
a2, *_ = tas_globmean_smoothed["historical"].tas.plot.line(ax=ax, x="time", lw=2)

ax.legend([h0, a2], ["Ensemble members", "Smooth ensemble mean"])

[Figure: historical global mean temperature anomalies, with the ensemble members in grey and the smoothed ensemble mean as a thicker line.]
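For readers unfamiliar with LOWESS, the following sketch shows the basic technique: a locally weighted linear regression with tricube weights, using the `n_steps` nearest points for each fit. `lowess_sketch` is a hypothetical function written for illustration; it treats the time steps as equidistant, omits robustness iterations, and is not the implementation behind `mesmer.stats.lowess`.

```python
import numpy as np


def lowess_sketch(y, n_steps):
    """Locally weighted linear regression with tricube weights (illustrative only)."""
    y = np.asarray(y, dtype=float)
    x = np.arange(y.size, dtype=float)  # equidistant steps, not the time coordinates
    out = np.empty_like(y)
    for i in range(y.size):
        dist = np.abs(x - x[i])
        # window: the n_steps points closest to the current point
        idx = np.argsort(dist, kind="stable")[:n_steps]
        h = dist[idx].max()
        weights = (1 - (dist[idx] / h) ** 3) ** 3  # tricube weights
        # np.polyfit weights the residuals themselves, hence the square root
        slope, intercept = np.polyfit(x[idx], y[idx], deg=1, w=np.sqrt(weights))
        out[i] = slope * x[i] + intercept
    return out


years = np.arange(1850, 2015)
rng = np.random.default_rng(0)

# synthetic series: slowly varying signal plus year-to-year noise
signal = 0.00005 * (years - 1850) ** 2
noisy = signal + rng.normal(scale=0.1, size=years.size)

smoothed = lowess_sketch(noisy, n_steps=50)
```

A larger `n_steps` gives a smoother curve, because each local fit averages over more years.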

Fit the parameter for the volcanic contribution using only the historical period. The fit uses the residuals of all available historical ensemble members with respect to the smoothed ensemble mean. The future scenarios do not contain volcanic contributions.

hist_tas_residuals = tas_globmean["historical"] - tas_globmean_smoothed["historical"]

# fit volcanic influence
volcanic_params = mesmer.volc.fit_volcanic_influence(hist_tas_residuals.tas)

volcanic_params.aod

<xarray.DataArray 'aod' ()> Size: 8B
array(-1.75209374)
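The fitted value above is a single coefficient that scales the aerosol optical depth (AOD). One way to obtain such a coefficient is a linear regression through the origin of the residuals on AOD, pooled over all members. The sketch below assumes this form and uses synthetic AOD values and residuals; it does not reproduce the internals of `mesmer.volc.fit_volcanic_influence`.

```python
import numpy as np

rng = np.random.default_rng(0)
n_members, n_years = 2, 165

# synthetic AOD: zero except for two hypothetical eruptions that decay over a few years
aod = np.zeros(n_years)
aod[33:37] = [0.12, 0.08, 0.04, 0.01]
aod[141:145] = [0.15, 0.10, 0.05, 0.02]

# synthetic residuals: cooling proportional to AOD plus noise, shape (member, time)
true_coef = -1.75
resids = true_coef * aod + rng.normal(scale=0.02, size=(n_members, n_years))

# regression through the origin, pooling all members:
# coef = sum(x * y) / sum(x * x)
x = np.tile(aod, n_members)  # repeat the AOD series once per member
y = resids.ravel()  # member 0 followed by member 1, matching x
coef = (x @ y) / (x @ x)

# volcanic contribution that could be added to a smoothed series
volcanic_contribution = coef * aod
print(coef)
```

A negative coefficient means that years with high AOD are cooler than the smoothed trend.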

Superimpose the volcanic influence on the historical time series. Because the historical data is treated as its own scenario, we encounter discontinuities at the boundary between the historical and future periods. However, this is not relevant for the parameter fitting that follows.

# superimpose the volcanic forcing on historical data
tas_globmean_smoothed["historical"] = mesmer.volc.superimpose_volcanic_influence(
    tas_globmean_smoothed["historical"],
    volcanic_params,
)

# plot global mean time series
f, ax = plt.subplots()

# plot unsmoothed global means
tas_globmean["historical"].tas.plot.line(
    ax=ax, lw=1, x="time", color="0.5", add_legend=False
)
tas_globmean["ssp126"].tas.plot.line(
    ax=ax, lw=1, x="time", color="#6baed6", add_legend=False
)
tas_globmean["ssp585"].tas.plot.line(
    ax=ax, lw=1, x="time", color="#fc9272", add_legend=False
)

# plot smoothed global means including volcanic influence for historical
tas_globmean_smoothed["historical"].tas.plot.line(
    ax=ax, lw=1.5, x="time", color="0.1", label="historical"
)
tas_globmean_smoothed["ssp126"].tas.plot.line(
    ax=ax, lw=1.5, x="time", color="#08519c", label="ssp126"
)
tas_globmean_smoothed["ssp585"].tas.plot.line(
    ax=ax, lw=1.5, x="time", color="#de2d26", label="ssp585"
)

# histend = tas_globmean["historical"].time.isel(time=-1).item()
# ax.axvline(histend, color="0.4")
ax.axhline(0, color="0.1", lw=0.5)

ax.set_title("")
plt.legend(loc="upper left")

[Figure: global mean temperature anomalies for historical, ssp126 and ssp585, with the unsmoothed members as thin lines and the smoothed series (including the volcanic influence for historical) as thicker lines.]

Calculate the residuals with respect to the smoothed time series, i.e. remove the smoothed global mean (including the volcanic influence) from the anomalies.

tas_globmean_resids = tas_globmean - tas_globmean_smoothed
# rename to tas_resids
tas_globmean_resids = mesmer.datatree.map_over_datasets(
    lambda ds: ds.rename({"tas": "tas_resids"}), tas_globmean_resids
)

# plot residuals
h0, *_ = tas_globmean_resids["historical"].tas_resids.plot.line(
    x="time", color="0.5", lw=1, add_legend=False
)
h1, *_ = tas_globmean_resids["ssp126"].tas_resids.plot.line(
    x="time", color="#08519c", lw=1, add_legend=False
)
h2, *_ = tas_globmean_resids["ssp585"].tas_resids.plot.line(
    x="time", color="#de2d26", lw=1, add_legend=False
)

plt.title("Residuals")
plt.axhline(0, lw=1, color="0.1")

plt.legend([h0, h1, h2], ["historical", "ssp126", "ssp585"])

[Figure: residuals of the global mean temperature for historical, ssp126 and ssp585.]

Saving the parameters#

Finally, we have calibrated all required parameters and can save them. We use filefisher to build consistent file names for the parameter files.

# define path relative to this notebook & create folder
param_path = pathlib.Path("./output/calibrated_parameters/")

PARAM_FILEFINDER = filefisher.FileFinder(
    path_pattern=param_path / "{esm}_{scen}",
    file_pattern="params_{module}_{esm}_{scen}.nc",
)

scen_str = "-".join(scenarios)

folder = PARAM_FILEFINDER.create_path_name(esm=model, scen=scen_str)
pathlib.Path(folder).mkdir(exist_ok=True, parents=True)

params = {
    "volcanic": volcanic_params,
    "global-variability": global_ar_params,
    "local-trends": local_lin_reg,
    "local-variability": local_ar,
    "covariance": localized_ecov,
    "grid-orig": grid_orig,
}


save_files = False  # we don't save them here in the example
if save_files:

    for module, param in params.items():

        filename = PARAM_FILEFINDER.create_full_name(
            module=module,
            esm=model,
            scen=scen_str,
        )

        param.to_netcdf(filename)
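To see which names such patterns produce, the sketch below fills the same placeholders with `str.format` from the standard library. It only illustrates the naming scheme and does not use filefisher; the module list is shortened to two entries.

```python
import pathlib

# same placeholders as in the patterns above
path_pattern = "output/calibrated_parameters/{esm}_{scen}"
file_pattern = "params_{module}_{esm}_{scen}.nc"

keys = {"esm": "IPSL-CM6A-LR", "scen": "-".join(["ssp126", "ssp585"])}

# folder name from the path pattern
folder = pathlib.Path(path_pattern.format(**keys))

# one file name per parameter module
filenames = {
    module: folder / file_pattern.format(module=module, **keys)
    for module in ["volcanic", "global-variability"]
}

print(filenames["volcanic"].as_posix())
```

Encoding the model and the scenarios in both the folder and the file name keeps parameter sets from different calibrations apart.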

To use the calibrated parameters for emulation, see the tutorials on emulating one or multiple scenarios in the Tutorials section.
