i7aof.extrap¶
Purpose: Orchestrate horizontal and vertical extrapolation of CMIP-derived ct/sa to the ISMIP grid using external Fortran executables.
Public Python API (by module)¶
Module: i7aof.extrap
  load_template_text(): Load the combined horizontal/vertical namelist template text.
Module: i7aof.extrap.cmip
  extrap_cmip(): Orchestrate per-file, per-variable extrapolation in time chunks with optional parallel workers and per-chunk logs.
Post-extrap conservative resampling (z_extrap → z) is performed using a Zarr-first, append-by-time workflow implemented in i7aof.extrap.shared.
Required config options¶
[workdir]
  base_dir: required unless the workdir argument is provided.
[cmip_dataset]
  lon_var, lat_var: variable/dimension names on input.
[extrap_cmip]
  time_chunk: int; time chunk size for extrapolation.
  num_workers: int or 'auto'/'0'; controls process parallelism.
  time_chunk_resample: int; time chunk size for post-extrap vertical resampling to z-levels (Zarr append chunk length).
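As a rough illustration of how these options fit together, the sketch below parses a hypothetical config with the standard-library configparser. The package itself loads configs through mpas-tools, and every value shown here (the path, the variable names, the chunk sizes) is an assumption made for the example, not a documented default.

```python
import configparser

# Hypothetical config text: all values are illustrative assumptions.
EXAMPLE_CFG = """
[workdir]
base_dir = /path/to/workdir

[cmip_dataset]
lon_var = lon
lat_var = lat

[extrap_cmip]
time_chunk = 12
num_workers = auto
time_chunk_resample = 12
"""

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_CFG)

# Integer options are read with getint(); num_workers stays a string
# because it may hold either an integer or the keyword 'auto'.
time_chunk = parser.getint('extrap_cmip', 'time_chunk')
time_chunk_resample = parser.getint('extrap_cmip', 'time_chunk_resample')
num_workers = parser.get('extrap_cmip', 'num_workers')
base_dir = parser.get('workdir', 'base_dir')
```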
Outputs¶
Per-variable vertically extrapolated monthly files under:
  extrap/{model}/{scenario}/Omon/ct_sa/*ismip<res>_extrap.nc
Per-chunk intermediates and logs under *_tmp/ next to the final output:
  input_<i0>_<i1>.nc (prepared inputs)
  horizontal_<i0>_<i1>.nc, vertical_<i0>_<i1>.nc
  logs/<var>_t<i0>-<i1>.log containing Fortran output and Python tracebacks
Post-extrap conservative resampled outputs (on z levels) are written next to the extrapolated file with the ISMIP resolution component updated from ismip<hres>_<dz_extrap> to ismip<hres>_<dz>. If the resolution string is unchanged, a _z suffix is used, e.g., ..._extrap_z.nc. A temporary <basename>.zarr/ store is created during resampling and removed after the final NetCDF is written.
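The naming rule for the resampled output can be sketched as a small helper. The function name resampled_name, its arguments, and the example file names are hypothetical; the sketch assumes the resolution component appears literally in the file name and that only that component changes.

```python
import os


def resampled_name(extrap_name, hres, dz_extrap, dz):
    """Derive the resampled file name from the extrapolated one (sketch)."""
    old = f'ismip{hres}_{dz_extrap}'
    new = f'ismip{hres}_{dz}'
    if old != new and old in extrap_name:
        # Resolution string changes: swap the component in place.
        return extrap_name.replace(old, new)
    # Resolution string unchanged: fall back to a _z suffix before the extension.
    stem, ext = os.path.splitext(extrap_name)
    return f'{stem}_z{ext}'
```

For instance, with the made-up name ct_ismip8km_60m_extrap.nc, a change from 60m to 50m would give ct_ismip8km_50m_extrap.nc, while an unchanged resolution would give ct_ismip8km_60m_extrap_z.nc.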
Data model¶
Ensures x, y coordinates and retains only required variables for the Fortran tools (target variable, time, x, y, and z/z_extrap).
Final concatenation injects ISMIP grid coordinates and related variables.
Runtime and external requirements¶
Core: xarray, numpy, dask (scheduler control), mpas-tools (config/logging).
Tools: Fortran executables i7aof_extrap_horizontal and i7aof_extrap_vertical.
Environment: HDF5_USE_FILE_LOCKING=FALSE set by default; OMP/BLAS/MKL threads set to 1 per worker.
  stdbuf used for unbuffered Fortran output when available.
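A minimal sketch of this environment setup follows. The helper names worker_env and unbuffered_command are hypothetical, and the exact set of thread variables the package touches is an assumption; the three used here are the conventional OpenMP, OpenBLAS, and MKL ones.

```python
import os
import shutil


def worker_env(base=None):
    """Build a per-worker environment (sketch, not the package's code)."""
    env = dict(os.environ if base is None else base)
    # "Set by default" is read here as: keep any value the user already set.
    env.setdefault('HDF5_USE_FILE_LOCKING', 'FALSE')
    # One thread per worker avoids oversubscribing cores in a process pool.
    for name in ('OMP_NUM_THREADS', 'OPENBLAS_NUM_THREADS', 'MKL_NUM_THREADS'):
        env[name] = '1'
    return env


def unbuffered_command(cmd):
    """Prefix cmd with stdbuf when it is on PATH, else return cmd unchanged."""
    stdbuf = shutil.which('stdbuf')
    if stdbuf is None:
        return list(cmd)
    # -o0/-e0 disable stdout/stderr buffering in GNU coreutils stdbuf.
    return [stdbuf, '-o0', '-e0', *cmd]
```

The resulting list and environment could then be passed to subprocess.run(cmd, env=env, ...) when launching a Fortran executable.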
Usage¶
from i7aof.extrap.cmip import extrap_cmip

extrap_cmip(
    model='CESM2-WACCM',
    scenario='historical',
    user_config_filename='my-config.cfg',
    num_workers='auto',  # or an int
)
CLI
ismip7-antarctic-extrap-cmip \
    --model CESM2-WACCM \
    --scenario historical \
    --config my-config.cfg \
    --num_workers auto
Internals (for maintainers)¶
Time chunking computed from source metadata; chunks run serially or in a process pool (spawn start method). Each chunk writes a per-chunk input with Dask scheduler='synchronous' for safer HDF5 writes, then runs horizontal and vertical Fortran steps with unbuffered stdout/stderr captured to the same log.
Worker failures raise a ChunkFailed(i0, i1, log_path, message) that the parent logs verbosely before cancelling outstanding futures. Pool crashes log completed vs pending chunk indices and point to the logs directory.
Finalization concatenates vertical outputs and injects grid coordinates/vars.
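The chunking and failure handling described above might look roughly like the sketch below. Only the ChunkFailed(i0, i1, log_path, message) signature comes from this page; the helpers time_chunks and run_chunks are hypothetical, the half-open [i0, i1) index convention is an assumption, and the sketch accepts any concurrent.futures executor rather than hard-coding a spawn-based process pool.

```python
from concurrent.futures import as_completed


class ChunkFailed(Exception):
    """Raised by a worker when one time chunk fails (sketch)."""

    def __init__(self, i0, i1, log_path, message):
        # Passing every field to Exception keeps the error picklable,
        # which matters when it crosses a process boundary.
        super().__init__(i0, i1, log_path, message)
        self.i0 = i0
        self.i1 = i1
        self.log_path = log_path
        self.message = message


def time_chunks(n_time, chunk):
    """Split n_time steps into half-open (i0, i1) ranges of length <= chunk."""
    return [(i0, min(i0 + chunk, n_time)) for i0 in range(0, n_time, chunk)]


def run_chunks(bounds, worker, executor):
    """Submit worker(i0, i1) per chunk; cancel pending work on failure."""
    futures = {executor.submit(worker, i0, i1): (i0, i1) for i0, i1 in bounds}
    done = []
    try:
        for future in as_completed(futures):
            future.result()  # re-raises ChunkFailed from the worker
            done.append(futures[future])
    except ChunkFailed:
        for future in futures:
            future.cancel()  # no effect on chunks that already started
        raise
    return sorted(done)
```

With a process pool, the executor could be created as ProcessPoolExecutor(mp_context=multiprocessing.get_context('spawn')), and worker would need to be a module-level function so that it can be pickled.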
Post-extrap vertical resampling (z_extrap → z)¶
Strategy: open the extrapolated NetCDF lazily (optionally chunked by time), iterate over time chunks of length [extrap_cmip] time_chunk_resample, apply i7aof.vert.resamp.VerticalResampler to conservatively map intensive fields onto z, and append each chunk to a Zarr store using append_dim='time' when applicable. After all chunks, open the Zarr store once, preserve per-variable chunk encodings, write a single NetCDF, and delete the Zarr store.
Output variable dimension order is enforced to (time?, z, y, x).
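To illustrate what conservative mapping of an intensive field means for a single column, the sketch below uses overlap-weighted averaging between source and target layers. It is not the VerticalResampler implementation: the function name, the use of layer edges as inputs, and the convention that None marks a missing value are all assumptions for this example.

```python
def conservative_resample(src_edges, src_values, dst_edges):
    """Overlap-weighted average of layer values onto new layers (sketch).

    src_edges and dst_edges are increasing layer interfaces; src_values
    holds one value per source layer, with None marking missing data.
    """
    out = []
    for k in range(len(dst_edges) - 1):
        lo, hi = dst_edges[k], dst_edges[k + 1]
        total = 0.0
        weight = 0.0
        for j, value in enumerate(src_values):
            # Thickness shared by source layer j and target layer k.
            overlap = min(hi, src_edges[j + 1]) - max(lo, src_edges[j])
            if overlap > 0 and value is not None:
                total += overlap * value
                weight += overlap
        # Normalizing by the valid overlap keeps the field intensive.
        out.append(total / weight if weight > 0 else None)
    return out
```

Mapping four 10-unit layers with values 1, 2, 3, 4 onto two 20-unit layers yields 1.5 and 3.5, so the thickness-weighted column mean is preserved.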
Edge cases / validations¶
Missing required dims (x, y) or target variable raise clear errors.
Grid/field dimension mismatches raise ValueError in preparation phase.
If final output exists, the file is skipped entirely. If a chunk’s vertical output exists, it won’t be recomputed (idempotent per-chunk work).
If the final resampled NetCDF already exists, resampling is skipped.
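The skip rules above amount to a simple existence check before any work is scheduled. The helper below is a hypothetical sketch of that idea, assuming each chunk is identified by the path of its vertical output file.

```python
import os


def chunks_to_run(final_path, chunk_outputs):
    """Return the chunk output paths that still need computing (sketch)."""
    if os.path.exists(final_path):
        # Final output present: the whole file is skipped.
        return []
    # Otherwise recompute only chunks whose vertical output is missing.
    return [path for path in chunk_outputs if not os.path.exists(path)]
```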
Extension points¶
Add support for additional variables or alternate Fortran executables.
Provide alternative scheduling strategies (e.g., per-node pools) or adaptive chunk sizing.
Make faulthandler and logging verbosity configurable.