How to create a mask for CESM1’s “urban areas”?

This script is used for creating a urban mask at the global scale for CESM1 data.

Step 0: load necessary packages and define parameters

%matplotlib inline
import warnings
import intake
import numpy as np
import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt

# define parameters for data retrieval
catalog_url = ''
experiment = "RCP85"
frequency = "daily"
urban_variable = "TREFMXAV_U"
cam_variable = "TREFHT"

Step 1: load datasets

col = intake.open_esm_datastore(catalog_url)
col_subset =, frequency=frequency, variable=urban_variable)
dsets = col_subset.to_dataset_dict(zarr_kwargs={"consolidated": True},
                                   storage_options={"anon": True})["lnd.RCP85.daily"]

--> The keys in the returned dictionary of datasets are constructed as follows:
100.00% [1/1 00:00<00:00]
Step 2: find the urban gridcell
Given that urban gridcell is time-invariant, let’s use member_id = 2 and time="2006-01-02"
da = dsets.sel(member_id=2, time="2006-01-02")[urban_variable].load()
<matplotlib.collections.QuadMesh at 0x2abf60495790>
Step 3: save the urban mask
The file is save at current working directory, with a file name “”

Step 4: load the urban mask

mask = xr.open_dataset("./")["mask"]
<matplotlib.collections.QuadMesh at 0x2abf60a2df10>

Step 5: apply the urban mask to CAM

load CAM data

col_subset =, frequency=frequency, variable=cam_variable)
dsets = col_subset.to_dataset_dict(zarr_kwargs={"consolidated": True},
                                   storage_options={"anon": True})['atm.RCP85.daily']
da_cam = dsets.sel(member_id=2, time="2006-01-02")[cam_variable].load()

--> The keys in the returned dictionary of datasets are constructed as follows:
100.00% [1/1 00:00<00:00]
<matplotlib.collections.QuadMesh at 0x2abf6164fd60>

apply the mask to CAM data and calculate the difference

da_cam_urban = da_cam.where(mask)

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(15,4))
(da-da_cam_urban).rename("Urban Heat Island").plot(ax=ax3, cmap="bwr")

check the dimension

print("city number:", da.to_dataframe().dropna().shape[0])
assert (da-da_cam_urban).rename("Urban Heat Island").to_dataframe().dropna().shape[0] == 4439
city number: 4439