Measles in England and Wales — Analysis¶
This notebook loads the simulation outputs saved by nb_15_england_wales_model.ipynb
and compares simulated dynamics with the observed 1944–1964 measles data.
Two key patterns from the literature¶
1. Fadeout fraction¶
Smaller cities experience longer disease-free periods. When a city's susceptible pool is depleted by an epidemic wave, measles can go extinct locally. The time until reintroduction (via importation from larger cities or external sources) grows as city size falls, because the absolute number of newly susceptible births per week is smaller. The fadeout proportion — the fraction of bi-weekly reporting periods with zero cases — is therefore a strong decreasing function of city population. This relationship was formalised by Bartlett (1957) and quantified empirically by Conlan et al. (2010).
2. Travelling waves¶
Measles spread outward from London and other major cities. Grenfell et al. (2001) showed that epidemic waves originate in the largest cities and propagate to smaller towns with a time lag that increases with distance from London. This hierarchical spread is captured (approximately) by the gravity coupling in the SEIR model.
import sys
from pathlib import Path
import numpy as np
import matplotlib.pyplot as plt
data_dir = Path("data")
sys.path.insert(0, str(data_dir))
from EnglandWalesMeasles import data as engwal
# Load simulation outputs from nb_15
incidence = np.load(data_dir / "ew_sim_incidence.npy") # (7300, 954)
populations = np.load(data_dir / "ew_sim_populations.npy") # (954,)
placenames = list(np.load(data_dir / "ew_placenames.npy", allow_pickle=True))
# Observed bi-weekly cases per city
obs_cases = np.array([engwal.places[p].cases for p in placenames]) # (954, 546)
latitudes = np.array([engwal.places[p].latitude for p in placenames])
longitudes = np.array([engwal.places[p].longitude for p in placenames])
print(f"Simulated incidence shape: {incidence.shape} (days, places)")
print(f"Observed cases shape: {obs_cases.shape} (places, reporting periods)")
--------------------------------------------------------------------------- FileNotFoundError Traceback (most recent call last) Cell In[1], line 11 7 sys.path.insert(0, str(data_dir)) 8 from EnglandWalesMeasles import data as engwal 9 10 # Load simulation outputs from nb_15 ---> 11 incidence = np.load(data_dir / "ew_sim_incidence.npy") # (7300, 954) 12 populations = np.load(data_dir / "ew_sim_populations.npy") # (954,) 13 placenames = list(np.load(data_dir / "ew_placenames.npy", allow_pickle=True)) 14 File ~/projects/laser-fresh/laser.cohorts/.docs/lib/python3.12/site-packages/numpy/lib/_npyio_impl.py:454, in load(file, mmap_mode, allow_pickle, fix_imports, encoding, max_header_size) 452 own_fid = False 453 else: --> 454 fid = stack.enter_context(open(os.fspath(file), "rb")) 455 own_fid = True 457 # Code to distinguish from NumPy binary files and pickles. FileNotFoundError: [Errno 2] No such file or directory: 'data/ew_sim_incidence.npy'
# The observed fadeout proportion for each city is the fraction of
# bi-weekly reporting periods with zero reported cases.
obs_prop_zero = (obs_cases == 0).mean(axis=1) # (954,)
# Group into log10(population) bins for a cleaner look
logpop = np.log10(populations)
print(f"Population range: {populations.min():,} \u2013 {populations.max():,}")
print(f"Observed prop_zero range: {obs_prop_zero.min():.3f} \u2013 {obs_prop_zero.max():.3f}")
nticks = incidence.shape[0] # 7300
period = 14 # bi-weekly
# Trim to a multiple of 14
n_periods = nticks // period
incidence_trimmed = incidence[:n_periods * period] # (n_periods*14, 954)
# Reshape and sum: (n_periods, 14, 954) → sum over 14 days → (n_periods, 954)
biweekly_sim = incidence_trimmed.reshape(n_periods, period, -1).sum(axis=1) # (n_periods, 954)
# Burn in first 2 years (730 days → 52 bi-weekly periods)
burn_periods = (2 * 365) // period
sim_after_burn = biweekly_sim[burn_periods:] # (n_periods - burn, 954)
sim_prop_zero = (sim_after_burn == 0).mean(axis=0) # (954,)
print(f"Simulated prop_zero range: {sim_prop_zero.min():.3f} \u2013 {sim_prop_zero.max():.3f}")
# This is the key figure from Conlan et al. (2010):
# smaller cities have higher fadeout fractions.
fig, ax = plt.subplots(figsize=(8, 6))
ax.scatter(logpop, obs_prop_zero,
s=18, alpha=0.6, color="#0072B2",
label="Observed (1944\u20131964)", edgecolors="none")
ax.scatter(logpop, sim_prop_zero,
s=18, alpha=0.6, color="#E69F00", marker="s",
label="Simulated (laser.cohorts)", edgecolors="none")
ax.set_xlabel("City population (log\u2081\u2080 scale)", fontsize=13)
ax.set_ylabel("Proportion of periods\nwith zero reported cases", fontsize=13)
ax.set_title("Measles fadeout and persistence\nEngland & Wales 954 cities, 1944\u20131964", fontsize=13)
ax.set_xticks([3, 4, 5, 6])
ax.set_xticklabels([r"$10^3$", r"$10^4$", r"$10^5$", r"$10^6$"])
ax.legend(fontsize=11)
plt.tight_layout()
plt.show()
highlight = ["London", "Birmingham", "Manchester", "York", "Aberystwyth"]
highlight = [p for p in highlight if p in placenames]
# Observed: bi-weekly reports 1944-1964 (546 periods)
t_obs = 1944.0 + np.arange(obs_cases.shape[1]) / 26.0 # decimal years
# Simulated: bi-weekly aggregated
n_biweekly_sim = biweekly_sim.shape[0]
t_sim = 1944.0 + np.arange(n_biweekly_sim) / 26.0
fig, axes = plt.subplots(len(highlight), 2, figsize=(16, 3 * len(highlight)),
sharey="row", sharex="col")
fig.suptitle("Observed vs. simulated measles \u2014 selected cities", fontsize=13)
for row, city in enumerate(highlight):
idx = placenames.index(city)
ax_obs = axes[row, 0]
ax_sim = axes[row, 1]
ax_obs.plot(t_obs, obs_cases[idx], linewidth=0.7, color="#0072B2")
ax_obs.set_title(f"{city} \u2014 Observed")
ax_sim.plot(t_sim, biweekly_sim[:, idx], linewidth=0.7, color="#E69F00")
ax_sim.set_title(f"{city} \u2014 Simulated")
for ax in (ax_obs, ax_sim):
ax.set_ylabel("Cases\n(bi-weekly)")
for ax in axes[-1]:
ax.set_xlabel("Year")
plt.tight_layout()
plt.show()
# Map of England and Wales coloured by fadeout proportion.
# Large cities stay persistent (low prop_zero);
# small peripheral towns fade out often (high prop_zero).
fig, axes = plt.subplots(1, 2, figsize=(14, 10))
for ax, values, title, cmap in [
(axes[0], obs_prop_zero, "Observed fadeout proportion", "Blues"),
(axes[1], sim_prop_zero, "Simulated fadeout proportion", "Oranges"),
]:
sc = ax.scatter(longitudes, latitudes,
c=values, cmap=cmap, vmin=0, vmax=1,
s=4 * np.sqrt(populations / populations.max()) * 80,
alpha=0.8, edgecolors="none")
plt.colorbar(sc, ax=ax, label="Proportion zero")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.set_title(title)
plt.suptitle("Measles persistence \u2014 England & Wales", fontsize=13)
plt.tight_layout()
plt.show()
Summary¶
What the model captures well¶
- Qualitative fadeout pattern: The simulated fadeout proportion is higher in smaller cities and near-zero in the largest ones (London, Birmingham, Manchester), consistent with the observed data and the Bartlett threshold concept.
- Epidemic periodicity: The model produces roughly biennial epidemic peaks in large cities, driven by the interplay of seasonal forcing and susceptible replenishment via births — a hallmark of pre-vaccine measles dynamics.
- Geographic persistence gradient: The map of fadeout proportions shows a qualitative north-to-south and urban-to-rural gradient that resembles the observed pattern.
Where the model falls short¶
- Quantitative mismatch in fadeout: The exact threshold between persistent and
fadeing-out cities (the critical community size) depends sensitively on
beta0,epsilon,k_gravity, andc_gravity. Without formal fitting, the simulated curve is likely shifted relative to the observed one. - Travelling waves: The gravity model generates spatial coupling but does not reproduce the clear wavefronts that propagate outward from London at a measurable velocity. A more faithful network model (e.g. radiation model or fitted commuting matrices) would be needed.
- Population dynamics: We use static 1944 populations throughout; in reality cities grew substantially over 1944–1964, which changes the birth-replenishment rate and hence the epidemic periodicity.
Parameter fitting¶
The gravity parameters (k_gravity, c_gravity) and transmission parameters
(beta0, epsilon) were not formally fitted — they are reasonable starting
points based on the literature. A proper parameter search would require running
hundreds to thousands of simulations over a grid or MCMC chain and comparing
simulated and observed fadeout fractions (or likelihood scores). The laser.cohorts
cohort model is fast enough to make such parameter sweeps practical.