i7aof.biascorr

Purpose: Bias-correction methods (time-slice statistics and projection workflows) to align model fields with a reference climatology, producing corrected temperature (thetao) and salinity (so) on the ISMIP grid.

Public Python API (by module)

Import paths and brief descriptions by module:

  • Module: i7aof.biascorr.timeslice

    • Timeslice: A container and helper for extracting per-year (or mean) gridded T and S, computing 3D volume, binning volume in T–S space by basin, and building per-basin bias anomalies relative to a model reference period.

      • Key methods:

        • get_all_data() — convenience to call T/S extraction, volume, and binning.

        • get_T_and_S() — load thetao and so for a given time index or the first time.

        • get_volume() — compute grid-cell volume from uniform dx, dy, dz based on the first differences of x, y, z.

        • get_bins() — compute per-basin 2D histograms in (S, T) space: Vb, with edges Sb, Tb.

        • compute_delta(modref)() — compute filled per-basin anomalies deltaSf, deltaTf on the model-ref bins.

        • apply_anomaly(base)() — map filled anomalies back to 3D to produce S_corrected, T_corrected using the “base” fields.

      • Notable attributes: T, S, V, Vb, Sb, Tb, deltaSf, deltaTf, S_corrected, T_corrected, Nbins, Nbasins.

  • Module: i7aof.biascorr.projection

    • Projection: Orchestrates the bias-correction workflow across reference, model-reference, base (extrapolated), and future model periods; constructs continental-shelf basin masks; loops over years to write corrected outputs.

      • Key methods:

        • read_reference() — build Timeslice for reference (ref), model reference (modref), and base (base) and compute their data.

        • compute_bias() — compute salinity and temperature bin bias corrections (Sc, Tc) on the model reference bins.

        • read_model() — iterate over all model files in the future period, apply anomalies per time index, and write corrected outputs.

        • read_model_timeslice() — one-year helper used by read_model().

        • create_basin_masks() — construct per-basin masks over the continental shelf using topography and IMBIE basins.

        • plot_TS_diagrams(filename)() — quick-look T–S histograms per basin for ref/modref/bias-corrected bins.

      • Important attributes: ref, modref, base, basinNumber, basinmask, basinmask_extrap, Nbins, input filename patterns.

Required config options

Section [biascorr] is required. Keys used by the workflow:

  • Input datasets (NetCDF files or glob patterns):

    • thetao_ref, so_ref — reference dataset over the reference period (observational or reanalysis), on the ISMIP grid.

    • thetao_modref, so_modref — model over the reference period (non- extrapolated). Used to build model reference bins and anomalies.

    • thetao_base, so_base — model over the reference period with extrapolated fields, used as the base to which anomalies are applied.

    • thetao_mod, so_mod — model over the target/future period to be corrected (glob patterns; yearly or chunked files expected).

  • Ancillary inputs and parameters:

    • filename_topo — ISMIP topography file (output of i7aof.topo); used to mask out deep ocean and non-shelf regions.

    • filename_imbie — IMBIE basin mask file (output of i7aof.imbie), with variable basinNumber.

    • z_shelf (float, meters) — maximum bed elevation for “continental shelf” mask (e.g., -1500 m).

    • Nbins (int) — number of equally spaced bins for both S and T per basin.

Assumptions:

  • Input variables are named thetao and so, with dims (time, z, y, x).

  • Horizontal and vertical spacing are uniform and read from x, y, z coords. Volume currently ignores partial cell fractions (see Internals).

Outputs

  • Corrected NetCDF files written by Projection.read_model() in the current working directory, one pair per input chunk:

    • thetao_corrected_{startYear}_{endYear}.nc

    • so_corrected_{startYear}_{endYear}.nc

  • Optional: T–S diagram figure from Projection.plot_TS_diagrams(filename)().

Data model

  • Gridded variables: T(thetao), S(so), 3D volume V(z,y,x).

  • Per-basin T–S histograms: Vb(Nbasins,Nbins,Nbins) with bin edges Sb(Nbasins,Nbins+1), Tb(Nbasins,Nbins+1).

  • Filled per-basin anomalies on model-ref bins: deltaSf, deltaTf.

  • Corrected gridded fields: S_corrected(z,y,x), T_corrected(z,y,x).

  • Basin metadata: basinNumber(y,x), masks basinmask(Nbasins,y,x) and basinmask_extrap(Nbasins,y,x).

Runtime and external requirements

  • Core: numpy, xarray, tqdm, matplotlib (for plotting).

  • Inputs are expected on the ISMIP grid; see i7aof.grid, i7aof.topo, and i7aof.imbie for producing required ancillary files.

  • For the authoritative conda-forge environment, see dev-spec.txt (note pyproject.toml lists a PyPI-only subset).

Usage

High-level workflow to bias-correct a model period:

from i7aof.biascorr.projection import Projection

proj = Projection(config, logger)
proj.read_reference()   # loads ref, modref, base and computes bins/volumes
proj.compute_bias()     # derives per-basin Sc and Tc on model-ref bins
proj.read_model()       # loops over future files; writes corrected outputs

# Optional quick-look TS plots
proj.plot_TS_diagrams('ts_diagrams.png')

Notes:

  • Ensure filename_topo and filename_imbie point to files on the same ISMIP grid as the inputs.

  • Use sensible z_shelf so shelf basins contain finite volume.

Internals (for maintainers)

  • Timeslice volume and histograms:

    • Volume assumes uniform dx, dy, dz computed from the first grid steps; a TODO notes applying cell fractions in the future.

    • get_bins() builds 2D histograms in (S, T) per basin, weighting by grid cell volume masked by basinmask.

  • Anomaly computation and filling:

    • compute_delta(modref) forms volume-weighted anomalies on the model-ref bins and fills missing bins iteratively by averaging neighbor bins until full coverage.

  • Bias curves (Sc, Tc):

    • Salinity: per-basin search over scalings in [0.5, 1.5) to best match the reference S PDF while anchoring the chosen percentile (perc=99 by default).

    • Temperature: scale the (T - Tmin) distribution so the perc percentile matches reference; anchor at Tmin.

  • Apply to 3D fields: anomalies are looked up by bin indices derived from the base fields per basin and added to produce corrected fields.

  • Basin masks: create_basin_masks() reads topography and IMBIE basins; the shelf mask uses bed > z_shelf to exclude deep ocean; an “extrapolated” mask includes full basins for applying anomalies to base fields.

Edge cases / validations

  • File pairing: read_model() assumes each thetao file has a matching so file via name replacement. An assertion will fail otherwise.

  • Dimensions and names: variables must be named thetao and so and share the same time coordinate and (x,y,z) grids. The code uses xr.testing.assert_equal on time to enforce matching time axes.

  • Uniform spacing: volume assumes uniform spacing; highly nonuniform grids aren’t supported in the current implementation.

  • Binning robustness: empty bins lead to NaNs which are then filled iteratively; extreme cases could require tuning Nbins.

Extension points

  • Allow custom output paths/filenames for corrected fields instead of the current hard-coded names.

  • Incorporate partial-cell fractions and exact grid metrics in volume.

  • Support alternative binning strategies, percentiles, and scaling ranges.

  • Generalize variable names/dimension names or pass them via config.