Developer Guide: Adding a New Observation Type to the OMI Interface#

This page describes how to add a new observation type — generically called C below — to the TSMP-PDAF OMI (Observation Model Interface) framework. It covers the architecture, the role of each source file, and the changes required, using the existing GRACE-DA and SM-DA implementations as reference.

Architecture Overview#

Observation handling sits between the PDAF library and the eCLM model state in three layers:

PDAF library (src/)
    │  calls callback routines at each analysis step
    ▼
callback_obs_pdafomi.F90        ← single routing hub
    │  dispatches to one routine per observation type
    ▼
obs_GRACE_pdafomi.F90           ← one module per observation type
obs_SM_pdafomi.F90
obs_C_pdafomi.F90  (new)
    │  read observations, build H, apply H to state vector
    ▼
eCLM state vector (via enkf_clm_mod, mod_assimilation, …)

Each observation-type module is self-contained. The only coupling between modules is through the central routing file and the shared mod_assimilation module.

Files Involved#

To add a new observation type C, touch exactly eight things:

File

Action

interface/framework/obs_C_pdafomi.F90

Create (new file, modelled on the template)

interface/framework/Makefile

Add obs_C_pdafomi.o to MOD_USER_PDAFOMI

interface/framework/callback_obs_pdafomi.F90

Add calls for C to every callback routine

interface/framework/mod_assimilation.F90

Declare cradius_C, sradius_C, and any other C-specific parameters

interface/framework/mod_read_obs.F90

Add 'C' case to update_obs_type

interface/model/eclm/enkf_clm_mod_5.F90

Add clmupdate_C flag (parallel to clmupdate_tws, clmupdate_swc)

interface/framework/init_pdaf.F90

Import assim_C and set it from clmupdate_C

interface/framework/init_pdaf_parse.F90

Import rms_obs_C and add a parse call for it

The template at templates/omi/obs_OBSTYPE_pdafomi_TEMPLATE.F90 is a good starting skeleton. The existing SM and GRACE modules show how the template is applied in this codebase.


Step 1 — Create obs_C_pdafomi.F90#

Module-level variables#

At minimum, declare assim_C (a logical switch controlled by callback_obs_pdafomi), rms_obs_C (a constant fallback error standard deviation), and the PDAFomi types thisobs (of type obs_f, thread-safe) and thisobs_l (of type obs_l, declared THREADPRIVATE).

assim_C and rms_obs_C are wired up externally:

  • init_pdaf.F90 imports assim_C and sets it from clmupdate_C (i.e. assim_C = (clmupdate_C /= 0)), so the observation type is enabled automatically when the corresponding model-state update flag is active. Commented-out placeholders mark the insertion point.

  • init_pdaf_parse.F90 imports rms_obs_C and populates it via a parse call that reads rms_obs_C from enkfpf.par. Commented-out placeholders mark the insertion point here as well.

Additional module-level allocatable arrays — e.g. a cached climatological mean needed for an anomaly operator — follow the same pattern as in obs_GRACE_pdafomi.F90.

The obs_f type#

obs_f carries all observation metadata and is populated inside init_dim_obs_C. Mandatory fields:

Field

Type

Meaning

thisobs%doassim

INTEGER

1 = assimilate, 0 = skip

thisobs%disttype

INTEGER

distance metric for localization (see below)

thisobs%ncoord

INTEGER

number of spatial coordinates (usually 2)

thisobs%id_obs_p(:,:)

INTEGER, ALLOCATABLE

state-vector index for each process-local observation

Frequently useful optional fields:

  • icoeff_p: bilinear interpolation weights

  • obs_err_type: 0 = Gaussian, 1 = Laplace

  • inno_omit: innovation outlier rejection threshold

disttype choices:

  • 0 — Cartesian distance; coordinates in any consistent unit (GRACE uses integer grid-cell indices, so cradius_GRACE is in number of grid cells).

  • 3 — geographic distance; coordinates in degrees, radii in km (used by SM).

Use disttype=3 for point observations distributed geographically (C, SM) and disttype=0 for products already mapped to grid-cell indices (e.g. GRACE).


Routine 1 — init_dim_obs_C#

Called once per analysis step before the filter loop. Responsibilities:

  1. Set the mandatory thisobs fields.

  2. Read the global observation vector from the NetCDF file using read_obs_nc_type from mod_read_obs.

  3. Handle the dim_obs == 0 case (no observations scheduled for this step): allocate dummy arrays of size 1 and call PDAFomi_gather_obs before returning.

  4. For each global observation that falls within the PE-local domain, record the state-vector index in thisobs%id_obs_p and populate the local arrays obs_p, ivar_obs_p, and ocoord_p.

  5. Call PDAFomi_gather_obs to finalize observation bookkeeping across MPI ranks.

Reading observations: Both SM and GRACE use read_obs_nc_type, which checks the type_clm NetCDF variable and returns dim_obs = 0 if the file does not contain observations of the requested type.

Who reads the file: SM reads only on mype_filter==0 and broadcasts; GRACE reads on all filter processes because the spatial averaging operator needs each PE to know its contribution. For point observations like C, the SM pattern (read + broadcast) is simpler.

Snapping observations to the state vector: For point observations (SM, C), loop over all global observations and all PE-local gridcells and record the state-vector index for matching pairs. For GRACE the logic is inverted: each observation aggregates multiple gridcells, so the loop iterates over gridcells and maps each one to the observation it contributes to.


Routine 2 — obs_op_C#

Applies the observation operator H, mapping the PE-local model state to the observation space. For a simple point observation (SM pattern): allocate a local output array, extract state_p(obs_index_p(i)) for each local observation, then call PDAFomi_gather_obsstate to assemble the global result in ostate.

For aggregating operators (GRACE pattern), sum all gridcell contributions mapping to the same observation, apply any transformation (e.g. subtract temporal mean for TWS anomaly), and then call PDAFomi_gather_obsstate. Follow the SM pattern for C unless your observations represent spatial averages over a large footprint.


Routines 3 & 4 — Localization (required for LESTKF/LETKF)#

init_dim_obs_l_C counts observations within the localization radius of the current local analysis domain. It is a thin wrapper around PDAFomi_init_dim_obs_l, passing thisobs_l, thisobs, the local domain coordinates coords_l (in the same units and disttype as ocoord_p), and the localization parameters cradius_C and sradius_C.

localize_covar_C is required only for the LEnKF. It is an equally thin wrapper around PDAFomi_localize_covar with the same localization parameters.


Step 2 — Update callback_obs_pdafomi.F90#

This file dispatches PDAF callbacks to each observation module. Add C in the same pattern as GRACE and SM. The commented-out !USE obs_C_pdafomi placeholders show exactly where to insert new entries.

For each of the eight callback subroutines, add a USE obs_C_pdafomi statement and one call to the corresponding C routine:

  • init_dim_obs_pdafomi: set assim_C = .true., call init_dim_obs_C, and add dim_obs_C to the running total. The assim_* flags exist to disable an observation type for debugging; keep assim_C = .true. here because the module itself returns dim_obs = 0 when no observations are present. (This logic is planned to be complemented in the future by setting assim_C from the EnKF input file)

  • obs_op_pdafomi: call obs_op_C. Call order does not affect correctness because offsets in ostate are determined by the order of init_dim_obs_* calls above.

  • init_dim_obs_l_pdafomi: call init_dim_obs_l_C. The variable dim_obs_l is incremented inside each call; PDAF initializes it to 0 before invoking the callback.

  • localize_covar_pdafomi: call localize_covar_C.

  • Remaining callbacks (add_obs_err, init_obscovar, prodRinvA, prodRinvA_l, deallocate_obs): each needs a corresponding C call, all of which are thin wrappers around the matching PDAFomi generic routine, identical in structure to the existing GRACE and SM entries.


Step 3 — Add Parameters to mod_assimilation.F90#

Declare cradius_C and sradius_C next to the existing GRACE and SM radius parameters. These are read from the enkfpf.par input file via the existing namelist mechanism; add corresponding entries to the [DA] section parser in the same module.


Step 4 — Register the New Type in mod_read_obs.F90#

The routine update_obs_type reads the type_clm string from the observation file and sets the clmupdate_* flags for the next analysis cycle. Add a 'C' case that sets the appropriate flags. If no existing clmupdate_* flag covers your new variable, declare a new clmupdate_C in enkf_clm_mod.F90 and wire it to the state-vector fill and scatter logic in the eCLM interface routines (eclm/).


Key Design Differences Between GRACE-DA and SM-DA#

Localization distance units#

Type

disttype

Coordinate arrays

Radius unit

GRACE

0 (Cartesian)

integer grid-cell indices

grid cells

SM

3 (haversine)

degrees longitude/latitude

km

Observation operator complexity#

  • SM: direct extraction from state_p using a precomputed index.

  • GRACE: aggregating — all gridcell values within the GRACE footprint are summed and averaged, then the temporal mean is subtracted to produce a TWS anomaly. thisobs%id_obs_p(1, g) maps each gridcell g to its observation (inverse of the usual convention).

Follow the SM pattern for C point observations; follow the GRACE pattern if observations represent spatial averages over a large footprint.

Coverage checks and error specification#

GRACE discards observations where fewer than a minimum number of gridcells lie within the support radius (important near coastlines). SM and C point observations do not need this, but consider rejecting observations over water bodies or outside the CLM domain.

Error specification is controlled by multierr: constant (multierr=0), per-observation from obserr_clm (multierr=1), or full covariance matrix from obscov_clm (multierr=2).


Assimilating Multiple Observation Types at the Same Timestep#

The framework generates a state vector for each type individually before the assimilation, some things would need to be adapted when mutliple observation types are assimilated at the same timestep. Currently, one observation file only consists of one observation type. As SM observations are usually assimilated at noon and GRACE observations are assimilated at the end of the month at midnight, this should not provide any problems.