One Forecast, Two Targets
A single forecast can be evaluated against multiple targets in one run
by creating one EvaluationObject per target. This pattern is useful
when you want to compare gridded reanalysis skill against point
observation skill side-by-side, or when different event types share the
same forecast but require distinct ground-truth sources. The full
evaluation pipeline is described in Usage.
Why two targets?
ERA5 and GHCN answer different questions about the same forecast:
- ERA5 (gridded reanalysis) measures how well the model captures the spatial temperature field over a region.
- GHCN (station observations) measures how well the model matches actual thermometer readings at weather stations.
Running both in a single pass is more efficient than separate runs because EWB opens the forecast data once and caches it across case operators.
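The open-once idea can be illustrated with a small memoisation sketch. This is not EWB's internal code: `open_forecast` and `open_calls` are hypothetical names, and the "open" is a stand-in for an expensive remote read. It only shows why two evaluations that request the same source need not pay the cost twice.

```python
from functools import lru_cache

open_calls = []  # records every time the (pretend) store is actually opened


@lru_cache(maxsize=None)
def open_forecast(source: str) -> dict:
    """Hypothetical stand-in for an expensive open of a remote forecast store."""
    open_calls.append(source)
    return {"source": source}


source = "gs://weatherbench2/datasets/hres/2016-2022-0012-1440x721.zarr"

# Two targets evaluated in one run both ask for the same forecast source.
handle_for_era5 = open_forecast(source)
handle_for_ghcn = open_forecast(source)

print(len(open_calls))                     # number of real opens
print(handle_for_era5 is handle_for_ghcn)  # whether both share one handle
```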
Example — Heat wave skill against ERA5 and GHCN
import extremeweatherbench as ewb

forecast = ewb.ZarrForecast(
    source="gs://weatherbench2/datasets/hres/2016-2022-0012-1440x721.zarr",
    name="HRES",
    variable_mapping=ewb.HRES_metadata_variable_mapping,
    storage_options={"remote_options": {"anon": True}},
)

# Gridded ERA5 target
era5_target = ewb.ERA5(variables=["surface_air_temperature"])

# Point observation GHCN target
ghcn_target = ewb.GHCN()

shared_metrics = [
    ewb.metrics.MeanAbsoluteError(
        forecast_variable="surface_air_temperature",
        target_variable="surface_air_temperature",
    ),
    ewb.metrics.MaximumMeanAbsoluteError(
        forecast_variable="surface_air_temperature",
        target_variable="surface_air_temperature",
    ),
]

# One EvaluationObject per target, sharing the same forecast and metrics
eval_objects = [
    ewb.EvaluationObject(
        event_type="heat_wave",
        metric_list=shared_metrics,
        target=era5_target,
        forecast=forecast,
    ),
    ewb.EvaluationObject(
        event_type="heat_wave",
        metric_list=shared_metrics,
        target=ghcn_target,
        forecast=forecast,
    ),
]

cases = ewb.load_cases()
runner = ewb.evaluation(case_metadata=cases, evaluation_objects=eval_objects)
outputs = runner.run_evaluation()
The output DataFrame has a target_source column that distinguishes
rows from each target ("ERA5" vs "GHCN"), making it straightforward
to compare them:
era5_results = outputs[outputs["target_source"] == "ERA5"]
ghcn_results = outputs[outputs["target_source"] == "GHCN"]
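For a per-case comparison, the two subsets can also be placed in adjacent columns. The sketch below uses a small synthetic DataFrame rather than real EWB output: the values are made up, and the `case_id_number` column is an assumption about the output schema (only `target_source`, `metric`, and `value` appear elsewhere on this page).

```python
import pandas as pd

# Synthetic rows standing in for the evaluation output; values are made up.
outputs = pd.DataFrame(
    {
        "case_id_number": [1, 1, 2, 2],  # assumed column name
        "target_source": ["ERA5", "GHCN", "ERA5", "GHCN"],
        "metric": ["MeanAbsoluteError"] * 4,
        "value": [1.0, 2.0, 3.0, 5.0],
    }
)

# One row per (case, metric), one column per target.
side_by_side = outputs.pivot_table(
    index=["case_id_number", "metric"],
    columns="target_source",
    values="value",
)

# Positive values mean the forecast is further from stations than from ERA5.
side_by_side["GHCN_minus_ERA5"] = side_by_side["GHCN"] - side_by_side["ERA5"]
print(side_by_side)
```

If a case has several rows per target (for example, one per lead time), `pivot_table` averages them by default, so pass an explicit `aggfunc` if a different summary is wanted.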
Detailed Explanation
Each EvaluationObject expands into one CaseOperator per case whose event_type matches. With two EvaluationObjects and 337 matching cases you get 674 operators; they share the forecast source, so IO is not doubled. Alignment, however, happens independently for each target: GHCN uses nearest-neighbour interpolation to match station locations, while ERA5 uses spatial regridding to the forecast grid. The target_source column is set from the name attribute on the target object.
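To make the station-matching step concrete, here is a minimal NumPy sketch of nearest-neighbour sampling on a toy grid. It illustrates the general technique only and is not EWB's alignment code; the grid, station coordinates, and temperatures are all invented for the example.

```python
import numpy as np

# Toy 1-degree forecast grid (kelvin); coordinates and values are made up.
lats = np.array([24.0, 25.0, 26.0])
lons = np.array([76.0, 77.0, 78.0, 79.0])
forecast = 300.0 + np.arange(12, dtype=float).reshape(3, 4)

# Hypothetical station coordinates and observed temperatures.
station_lats = np.array([24.2, 25.9])
station_lons = np.array([78.6, 76.1])
station_obs = np.array([302.0, 309.5])

# Index of the nearest grid latitude/longitude for every station.
lat_idx = np.abs(lats[:, None] - station_lats[None, :]).argmin(axis=0)
lon_idx = np.abs(lons[:, None] - station_lons[None, :]).argmin(axis=0)

# Forecast value at the grid cell closest to each station.
forecast_at_stations = forecast[lat_idx, lon_idx]

# Station-based MAE: averaged over stations, not over grid cells.
mae = np.abs(forecast_at_stations - station_obs).mean()
print(forecast_at_stations, mae)
```

Because the error is averaged over stations rather than grid cells, the result reflects where stations happen to be located, which is one reason ERA5 and GHCN scores for the same case can differ.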
Example — Mixed event types with distinct targets
You can combine different event types and different targets in a single run. Here, heat wave skill is evaluated against ERA5, while freeze skill is evaluated against GHCN:
import extremeweatherbench as ewb

forecast = ewb.ZarrForecast(
    source="gs://weatherbench2/datasets/hres/2016-2022-0012-1440x721.zarr",
    name="HRES",
    variable_mapping=ewb.HRES_metadata_variable_mapping,
    storage_options={"remote_options": {"anon": True}},
)

eval_objects = [
    # Heat waves: evaluated against gridded ERA5
    ewb.EvaluationObject(
        event_type="heat_wave",
        metric_list=[
            ewb.metrics.MaximumMeanAbsoluteError(
                forecast_variable="surface_air_temperature",
                target_variable="surface_air_temperature",
            ),
        ],
        target=ewb.ERA5(variables=["surface_air_temperature"]),
        forecast=forecast,
    ),
    # Freezes: evaluated against GHCN station observations
    ewb.EvaluationObject(
        event_type="freeze",
        metric_list=[
            ewb.metrics.MinimumMeanAbsoluteError(
                forecast_variable="surface_air_temperature",
                target_variable="surface_air_temperature",
            ),
        ],
        target=ewb.GHCN(),
        forecast=forecast,
    ),
]

cases = ewb.load_cases()
runner = ewb.evaluation(case_metadata=cases, evaluation_objects=eval_objects)
outputs = runner.run_evaluation()
EWB matches each case's event_type field against the
event_type on each EvaluationObject, so heat wave cases run only
against the ERA5 target, and freeze cases run only against GHCN.
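The matching rule can be sketched in plain Python. The classes and the `expand` function below are hypothetical stand-ins written for this illustration, not EWB types; they show only the filtering logic described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """Hypothetical stand-in for a case record."""
    case_id: int
    event_type: str


@dataclass(frozen=True)
class EvalSpec:
    """Hypothetical stand-in for an evaluation object."""
    event_type: str
    target_name: str


def expand(cases, specs):
    """Pair every spec with each case that has the same event type."""
    return [
        (spec.target_name, case.case_id)
        for spec in specs
        for case in cases
        if case.event_type == spec.event_type
    ]


cases = [Case(1, "heat_wave"), Case(2, "heat_wave"), Case(3, "freeze")]
specs = [EvalSpec("heat_wave", "ERA5"), EvalSpec("freeze", "GHCN")]

pairs = expand(cases, specs)
print(pairs)
```

Under this rule, two specs that share an event type (as in the first example on this page) each pair with every matching case, which is where the doubling of operators comes from.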
Metrics that differ by target
Some metrics are more appropriate for gridded targets (spatial
displacement, RMSE over a grid) and others for point observations (MAE
at station locations). Supply different metric_list values to each
EvaluationObject accordingly:
eval_objects = [
    # Gridded target: RMSE over the grid
    ewb.EvaluationObject(
        event_type="heat_wave",
        metric_list=[
            ewb.metrics.RootMeanSquaredError(
                forecast_variable="surface_air_temperature",
                target_variable="surface_air_temperature",
            ),
        ],
        target=ewb.ERA5(variables=["surface_air_temperature"]),
        forecast=forecast,
    ),
    # Point target: MAE-based metrics at station locations
    ewb.EvaluationObject(
        event_type="heat_wave",
        metric_list=[
            ewb.metrics.MeanAbsoluteError(
                forecast_variable="surface_air_temperature",
                target_variable="surface_air_temperature",
            ),
            ewb.metrics.MaximumMeanAbsoluteError(
                forecast_variable="surface_air_temperature",
                target_variable="surface_air_temperature",
            ),
        ],
        target=ewb.GHCN(),
        forecast=forecast,
    ),
]
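When reading the two result sets together, keep in mind that RMSE and MAE are not interchangeable: RMSE weights large errors more heavily, so it is never smaller than MAE on the same data. A minimal NumPy sketch with made-up errors (this computes the textbook formulas, not EWB's metric classes):

```python
import numpy as np

# Made-up forecast-minus-target errors in kelvin, with one large miss.
errors = np.array([1.0, -1.0, 1.0, -5.0])

mae = np.abs(errors).mean()           # mean of |e|
rmse = np.sqrt((errors ** 2).mean())  # square root of the mean of e^2

print(f"MAE:  {mae:.4f} K")
print(f"RMSE: {rmse:.4f} K")
```

A gap between an ERA5 RMSE and a GHCN MAE therefore mixes two effects: the difference between targets and the difference between metrics.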
Complete Example
HRES evaluated against ERA5 and GHCN simultaneously for a single demo heat wave case (the 2022 India heat wave).
import datetime

import extremeweatherbench as ewb
from extremeweatherbench.cases import IndividualCase
from extremeweatherbench.regions import BoundingBoxRegion

# A single hand-built case keeps the demo small
demo_case = IndividualCase(
    case_id_number=9009,
    title="2022 India Heat Wave (demo)",
    start_date=datetime.datetime(2022, 4, 28),
    end_date=datetime.datetime(2022, 5, 1),
    location=BoundingBoxRegion.create_region(
        latitude_min=24.0,
        latitude_max=30.0,
        longitude_min=76.0,
        longitude_max=82.0,
    ),
    event_type="heat_wave",
)
cases = [demo_case]

forecast = ewb.ZarrForecast(
    source="gs://weatherbench2/datasets/hres/2016-2022-0012-1440x721.zarr",
    name="HRES",
    variable_mapping=ewb.HRES_metadata_variable_mapping,
    storage_options={"remote_options": {"anon": True}},
)

era5_target = ewb.ERA5(variables=["surface_air_temperature"])
ghcn_target = ewb.GHCN()

shared_metrics = [
    ewb.metrics.MeanAbsoluteError(
        forecast_variable="surface_air_temperature",
        target_variable="surface_air_temperature",
    ),
    ewb.metrics.MaximumMeanAbsoluteError(
        forecast_variable="surface_air_temperature",
        target_variable="surface_air_temperature",
    ),
]

eval_objects = [
    ewb.EvaluationObject(
        event_type="heat_wave",
        metric_list=shared_metrics,
        target=era5_target,
        forecast=forecast,
    ),
    ewb.EvaluationObject(
        event_type="heat_wave",
        metric_list=shared_metrics,
        target=ghcn_target,
        forecast=forecast,
    ),
]

runner = ewb.evaluation(
    case_metadata=cases,
    evaluation_objects=eval_objects,
)
outputs = runner.run_evaluation()
# Compare mean MAE per target
mae = outputs[outputs["metric"] == "MeanAbsoluteError"]
for source in ["ERA5", "GHCN"]:
    mean_mae = mae[mae["target_source"] == source]["value"].mean()
    print(f"{source:6s} mean MAE: {mean_mae:.4f} K")