Data Standardization with CMOR via Python CDO#

This series introduces you to the functions of cdo cmor.

cdo cmor calls the CMOR library to standardize climate model output for different projects.

Why refine data?#

Part of a sustainable scientific workflow is refining the produced data to make it FAIR. Among other things, FAIR research data is Interoperable and Reusable. Adopting a data standard helps to make data interoperable and reusable in many ways.

Interoperability:

  • Use of a widely accepted common data format across all generated data such as NetCDF

  • The inherited compliance to domain specific conventions like CF simplifies processing.

Reusability:

  • Data that is stored with sufficient metadata according to the data standard is self-descriptive and can be analyzed without external information.

    • E.g. statistical operations and interpolations over space and time can only be processed if all temporal and spatial information, including cell and interval bounds and a full description of the vertical axis, is available in the input file.

  • Continuously developed applications aim to be compatible with accepted data standards, which ensures long-term usability.

  • Defining a Data Reference Syntax, including templates for storage paths and file names, makes it possible to identify a single file by its set of specified project attributes.
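
As an illustration of such a template, the sketch below assembles a file name from project attributes. The template follows the CMIP6 file name pattern (variable, table, model, experiment, member, grid label, time range); the function name drs_filename and the example values are assumptions chosen for illustration only.

```python
# Minimal sketch of a Data Reference Syntax (DRS) file name template.
# The function name and the example attribute values are illustrative.
FILENAME_TEMPLATE = (
    "{variable_id}_{table_id}_{source_id}_{experiment_id}"
    "_{member_id}_{grid_label}_{time_range}.nc"
)


def drs_filename(attributes):
    """Fill the file name template from a dict of project attributes."""
    return FILENAME_TEMPLATE.format(**attributes)


example_attributes = {
    "variable_id": "tas",
    "table_id": "Amon",
    "source_id": "MPI-ESM1-2-LR",
    "experiment_id": "historical",
    "member_id": "r1i1p1f1",
    "grid_label": "gn",
    "time_range": "185001-186912",
}
print(drs_filename(example_attributes))
```

Because every part of the name is a controlled attribute, the same set of attributes always maps to the same file name, which is what allows a single file to be identified from its attributes.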

The scope#

For (very) large climate community projects like the Coupled Model Intercomparison Project (CMIP), systematic analysis across models is only easy if the model output is provided as FAIR data. In CMIP6, 2000 unique variables are defined, which can be submitted for 100 different experiments.

Approaches#

Two different approaches are possible:

  • Model output adaptation

  • Post-processing with specialized tools

Reasons against model output adaptation

  • Data standards evolve, which requires continuous updates to the output-writing code

  • Once adapted to a specific standard, the output is inflexible for new and other standards

  • Conservative scientists using stone-age but proven software will be hard to convince to switch to a new output format

  • Since data standard experts rarely work on the adaptation, the task is time-consuming and error-prone

Reasons for Post-processing with specialized tools

  • Developer specialization: experts on model development (post-processing) can focus on model development (post-processing) goals

  • Guarantee: standardization software ensures that the generated data conforms to the data standard, i.e. without flaws.

  • Compatibility: Other, older tools remain compatible with the original raw model output

  • Flexibility: Enables quick adaptation to other data standards

Definitions#

  • CMIP6

    • The recent phase 6 of the Coupled Model Intercomparison Project

  • CMIP Data Standard

    • Convention on climate data accepted in CMIP

  • CMOR

    • The Climate Model Output Rewriter can generate data compliant to the CMIP Data Standard.

  • CDO

    • Collection of operators to process climate data.

    • The python binding is a wrapper to call a specific binary correctly

CMOR#

The Climate Model Output Rewriter tool can generate data compliant to the CMIP Data Standard.

Features#

  • Different (CMIP-like) data standards can be produced

    • No user side preparation of data standard description

  • CMOR ensures that output conforms to the data standard. Building upon CMOR means using synergies, which

    • avoids repeating work

    • helps to concentrate on the actual goal instead of debugging own cmor-lite developments

Note

CMIP-like data standard means:

Each file must

  • contain only a single output data variable

  • cover only a single simulation

  • include coordinates and additional metadata

Why integrate CMOR into CDO?#

CDO

  • is widely used and accepted

  • has an active support by both users and developers

  • has an interface that allows

    • different infile formats

    • access to all infile information, no matter how it is structured

  • is fast because it is written in C++

Installation of CDO with CMOR#

If you work on the DKRZ HPC system, we recommend using the versions installed here:

ls -1 /work/bm0021/cdo_incl_cmor/

The older CMOR version 2 is used to generate data in the CMIP5 and CORDEX-CMIP5 data standards. Due to a design change in the CMOR functions, CMOR input data is neither upward nor downward compatible between the versions.

The interface of cdo cmor and the format of user input does not change with the installed CMOR versions. Scripts and files used for one project can be the starting point for the next project.

Installation with conda#

Only the recent CMOR3 version can be installed and linked to CDO via conda:

conda update conda
conda create --name cdocmorenv conda-forge/label/dev::cdo -c conda-forge
conda activate cdocmorenv

The environment for CDO with CMOR is named cdocmorenv, as set by --name cdocmorenv. Specifying conda-forge/label/dev::cdo installs CDO from the development channel, whose build includes CMOR. All other packages come from the conda-forge channel, selected with -c conda-forge.

Updating a conda installation of cdo with cmor:#

Depending on which additional packages you have installed, you may have to lower the channel_priority first.

conda config --set channel_priority flexible
conda install --name cdocmorenv conda-forge/label/dev::cdo -c conda-forge

Note

Debian CDO (sudo apt-get install cdo) is installed without CMOR

1. Preparation.#

Option A: On Levante#

Define vars for CDO and working directories

#- Current path:
import os
pwd=os.getcwd()
#
workdir="/work/bm0021/cdo_incl_cmor/examples/"
cdodir="/work/bm0021/cdo_incl_cmor/"
#
cdocmorinfo=workdir+".cdocmorinfo"

Option B: Local PC#

Clone repo for material:

!git clone https://gitlab.dkrz.de/dicad-pp/cdo-incl-cmor.git
Cloning into 'cdo-incl-cmor'...
remote: Enumerating objects: 877, done.
remote: Total 877 (delta 0), reused 0 (delta 0), pack-reused 877
Receiving objects: 100% (877/877), 26.04 MiB | 34.26 MiB/s, done.
Resolving deltas: 100% (401/401), done.
Checking connectivity... done.
#- Current path:
import os
pwd=os.getcwd()
#
basedir=pwd+"/cdo-incl-cmor"
workdir=basedir+"/application/handson/"
cdocmorinfo=pwd+"/.cdocmorinfo"

2. Set up CDO in Python#

# Point to the cdo binary installed in the environment of the running kernel
import sys
import os
cdobin = os.path.join(os.path.dirname(sys.executable), "cdo")

# Import the Python CDO bindings
from cdo import Cdo
cdo = Cdo(cdobin)
# Print the full CDO call and its output for every operator call
cdo.debug = True
# Do not re-create output files that already exist
cdo.forceOutput = False
help(cdo)
Help on Cdo in module cdo.cdo:

<cdo.cdo.Cdo object>

3. Interface#

%%capture --no-stdout
cdo.cmor(options="-h")
Found operator:cmor
# DEBUG - start =============================================================
CALL  :/envs/bin/cdo -O -s -h -cmor
STDOUT:
STDERR:
------------------------------------------------------------------------------------------------------------------------
  Usage : cdo  [Options]  Operator1  [-Operator2  [-OperatorN]]
------------------------------------------------------------------------------------------------------------------------

=== Options ============================================================================================================
    -a, --absolute_taxis                      Generate an absolute time axis.
        --argument_groups 
        --attribs  <arbitrary|filesOnly|onlyFirst|noOutput|obase> 
                                              Lists all operators with choosen features or the attributes of given operator(s)
                                              operator name or a combination of [arbitrary,filesOnly,onlyFirst,noOutput,obase].
    -S, --cdo_diagnostic                      Create an extra output stream for the module TIMSTAT. This stream
                                              contains the number of non missing values for each output period.
        --cellsearchmethod  <spherepart|latbins> 
                                              Sets the cell search method.
    -c, --check_data_range                    Enables checks for data overflow.
        --chunksize  <size>                   NetCDF4 chunk size.
    -k, --chunktype  <auto|grid|lines>        NetCDF4 chunk type: auto, grid or lines.
        --cmor                                CMOR conform NetCDF output.
    -C, --color  <auto|no|all>                Set behaviour of colorized output messages.
    -Z, --compress                            Enables compression. Default = SZIP
    -z, --compression_type  <aec|jpeg|zip[_1-9]|zstd[1-19]> 
                                              aec         AEC compression of GRIB2 records
                                              jpeg        JPEG compression of GRIB2 records
                                              zip[_1-9]   Deflate compression of NetCDF4 variables
                                              zstd[_1-19] Zstandard compression of NetCDF4 variables
        --config  <all|all-json|<specific_feature_name>> 
                                              Prints all features and the enabled status.
                                              Use option <all> to see explicit feature names.
    -d, --debug                               Pring all available debug messages
    -b, --default_datatype  <nbits>           Set the number of bits for the output precision
                                                  I8|I16|I32|F32|F64     for nc1,nc2,nc4,nc4c,nc5,nczarr;
                                                  U8|U16|U32             for nc4,nc4c,nc5;
                                                  F32|F64                for grb2,srv,ext,ieg;
                                                  P1 - P24               for grb1,grb2
        --disable_filesuffix  <true|false>    This option is generated from CDO_DISABLE_FILESUFFIX and will overwrite it.
                                              See help of corresponding environment variable.
        --disable_history  <true|false>       This option is generated from CDO_DISABLE_HISTORY and will overwrite it.
                                              See help of corresponding environment variable.
    -w, --disable_warnings                    Disable warning messages.
        --double                              Using double precision floats for data in memory.
        --download_path  <path>               This option is generated from CDO_DOWNLOAD_PATH and will overwrite it.
                                              See help of corresponding environment variable.
    -A, --dryrun                              Dry run that shows processed CDO call.
        --eccodes                             Use ecCodes to decode/encode GRIB1 messages.
        --enableexcept  <except>              Set individual floating-point traps 
                                              (DIVBYZERO, INEXACT, INVALID, OVERFLOW, UNDERFLOW, ALL_EXCEPT)
        --envvars                             Prints the environment variables of CDO.
        --file_suffix  <suffix>               This option is generated from CDO_FILE_SUFFIX and will overwrite it.
                                              See help of corresponding environment variable.
    -F, --filter  <filterId,params>           NetCDF4/HDF5 filter description.
        --float                               Using single precision floats for data in memory.
    -f, --format  <grb1|grb2|nc1|nc2|nc4|nc4c|nc5|nczarr|srv|ext|ieg> 
                                              Format of the output file.
    -g, --grid  <grid>                        Set default grid name or file. Available grids: 
                                              F<XXX>, t<RES>, tl<RES>, r<NX>x<NY>, global_<DXY>, zonal_<DY>, gme<NI>, lon=<LON>/lat=<LAT>
        --gridsearchradius  <degrees[0..180]> 
                                              Sets the grid search radius (0-180 deg).
    -M, --has_missval                         Set HAS_MISSVAL to true.
    -h, --help  <operator>                    Shows either help information for the given operator or the usage of CDO.
        --history                             Do append to NetCDF "history" global attribute.
        --history_info  <true|false>          This option is generated from CDO_HISTORY_INFO and will overwrite it.
                                              See help of corresponding environment variable.
        --icon_grids  <path>                  This option is generated from CDO_ICON_GRIDS and will overwrite it.
                                              See help of corresponding environment variable.
        --ignore_time_bounds                  Ignores time bounds for time range statistics.
    -i, --institution  <institute_name>       Sets institution name.
    -u, --interactive                         Enable CDO interactive mode.
    -L, --lock_io                             Lock IO (sequential access).
        --module_info  <module name>          Prints list of operators.
        --netcdf_hdr_pad  <nbr>               Pad NetCDF output header with nbr bytes.
        --no_history                          Do not append to NetCDF "history" global attribute.
        --no_remap_weights                    Switch off generation of remap weights.
    -P, --num_threads  <nthreads>             Set number of OpenMP threads.
        --operators                           Prints list of operators.
        --operators_no_output                 Prints all operators which produce no output.
    -O, --overwrite                           Overwrite existing output file, if checked.
        --pedantic                            Warnings count as errors.
        --percentile  <method>                Methods: nrank, nist, rtype8, <NumPy method (linear|lower|higher|nearest|...)>
        --precision  <float_digits[,double_digits]> 
                                              Precision to use in displaying floating-point data (default: 7,15).
        --reduce_dim                          Reduce NetCDF dimensions.
    -R, --regular                             Convert GRIB1 data from global reduced to regular Gaussian grid (cgribex only).
    -r, --relative_taxis                      Generate a relative time axis.
        --remap_weights  <0|1>                Generate remap weights (default: 1).
        --reset_history  <true|false>         This option is generated from CDO_RESET_HISTORY and will overwrite it.
                                              See help of corresponding environment variable.
        --rusage                              Print information about resource utilization.
    -D, --scoped_debug  <comma seperated scopes> 
                                              Multiple scopes suimultaneusly possible. Use option without arguments to get a list of possible scopes
        --seed  <seed>                        Seed for a new sequence of pseudo-random numbers. <seed> must be >= 0
    -m, --set_missval  <missval>              Set the missing value of non NetCDF files (default: -9e+33).
        --settings                            Prints the settings of CDO.
    -s, --silent                              Silent mode.
        --single                              Using single precision floats for data in memory.
    -Q, --sortname                            Alphanumeric sorting of NetCDF parameter names.
        --sortparam 
    -t, --table  <codetab>                    Set GRIB1 default parameter code table name or file (cgribex only).
                                              Predefined tables: echam4,echam5,echam6,mpiom1,ecmwf,remo,
                                                  cosmo002,cosmo201,cosmo202,cosmo203,cosmo205,cosmo250
        --test  <true|false>                  This option is generated from CDO_TEST and will overwrite it.
                                              See help of corresponding environment variable.
    -T, --timer                               Enable timer.
        --timestat_date  <srcdate>            Target timestamp (temporal statistics): 
                                              first, middle, midhigh or last source timestep.
        --use_fftw  <true|false>              Sets fftw usage.
        --use_time_bounds                     Enables use of timebounds.
    -v, --verbose                             Print extra details for some operators.
    -V, --version                             Print the version number.
        --version_info  <true|false>          This option is generated from CDO_VERSION_INFO and will overwrite it.
                                              See help of corresponding environment variable.
        --worker  <num>                       Number of worker to decode/decompress GRIB records.
    -l, --zaxis  <zaxis>                      Set default zaxis name or file.
------------------------------------------------------------------------------------------------------------------------
=== Environment Variables ==============================================================================================
    CDO_CORESIZE <max. core dump size>        The largest size (in bytes) core file that may be created.
    CDO_DISABLE_FILESUFFIX <true|false>       MISSING HELP
    CDO_DISABLE_HISTORY <true|false>          MISSING HELP
    CDO_DOWNLOAD_PATH <path>                  Path where CDO can store downloads.
    CDO_FILE_SUFFIX <suffix>                  Default filename suffix.
    CDO_HISTORY_INFO <true|false>             'false' don't write information to the global history attribute [default: true].
    CDO_ICON_GRIDS <path>                     Root directory of the installed ICON grids (e.g. /pool/data/ICON).
    CDO_RESET_HISTORY <true|false>            'true' resets the global history attribute [default: false].
    CDO_TEST <true|false>                     'true' test new features [default: false].
    CDO_VERSION_INFO <true|false>             'false' disables the global NetCDF attribute CDO [default: true].
------------------------------------------------------------------------------------------------------------------------

  Operators:
    Use option --operators for a list of all operators.

  CDO version 2.2.0, Copyright (C) 2002-2023 MPI für Meteorologie
  This is free software and comes with ABSOLUTELY NO WARRANTY
  Report bugs to <https://mpimet.mpg.de/cdo>

# DEBUG - end ===============================================================
RETURNCODE:0

The operator requires one parameter and one argument. The first parameter is always the MIP-table. The argument is the input file.
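
The call pattern can be sketched as follows. The helper below only assembles the parameter string that is handed to cdo.cmor; its name build_cmor_parameters and the file names in the example are hypothetical, and the i= keyword for cdocmorinfo files is taken from the call used further down in this notebook.

```python
def build_cmor_parameters(mip_table, info_files=()):
    """Return the comma-separated parameter string for the cmor operator.

    The MIP-table always comes first; optional cdocmorinfo files are
    passed as one comma-separated list behind the keyword ``i=``.
    """
    parameters = [mip_table]
    if info_files:
        parameters.append("i=" + ",".join(info_files))
    return ",".join(parameters)


# Hypothetical file names, for illustration only
params = build_cmor_parameters("CMIP6_Amon.json", ["dkrz_atts", "member_atts"])
print(params)
# With python-cdo this string would be passed as the operator parameter:
# cdo.cmor(params, input="example_interface.nc")
```

Keeping the string assembly separate from the cdo call makes it easy to inspect the parameters before anything is executed.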

Project data standard#

The project data standard is built from four types of documents:

  • The Data Request (Dreq): A data standard will only be defined for variables that are requested for and by the project, e.g. CMIP6

  • Output requirements (OR): Technical specifications for the structure, content and format of files, e.g. CMIP6

  • Global attributes (GA): Specifications for required and optional global attributes, e.g. CMIP6

  • A registry: Only names of institutions and ESMs that are registered are valid values of global attributes like institution or source. E.g. CMIP6

Controlled Vocabularies (CVs) and MIP-Tables#

The DReq, the GAs and the registry are translated into controlled vocabularies (CVs). This set is also called MIP-Tables, e.g. CMIP6.

  • One CV-MIP-Table contains a condensed form of all CVs which are version controlled in the registry. It contains

    • required and optional CMIP attributes

    • allowed values for attributes

    • restrictions resulting from a setting of attributes (e.g. min. simulation years of an experiment)

    • whether additional attributes must be specified (e.g. parent attributes)

  • All other MIP-Tables contain variable information

Tip

The MIP-Tables are input for CMOR. Because CMOR also implements the OR specifications, its output is guaranteed to be CMIP compliant.

A variable can be requested for different frequencies, dimensions or cell_methods. E.g., it can be reasonable to provide data on model level for reuse in ESMs while having another version of the data on pressure levels for easy analysis.

MIP-tables are divided by their variables’

  • realm

  • frequencies

  • grid and vertical axis types

  • time cell method.

so that a variable occurs only once in a MIP-table. The division also keeps the tables short.

In CMIP6, the MIP-table name is constructed from a Prefix, a Frequency, a Suffix and a Qualifier. However, not all four parts need to be included in a MIP-table name, and not every possible combination exists as a MIP-Table.

For this notebook, we work with the CMIP6 example. You can clone the MIP-Tables repository yourself or use the submodule inside the workshop material.

%%bash
#!git clone https://github.com/PCMDI/cmip6-cmor-tables.git {mip_tables_dir}
rel_mip_tables_dir=configuration/cmip6/cmip6-cmor-tables/
cd $(pwd)/cdo-incl-cmor 
git submodule init ${rel_mip_tables_dir} 
git submodule update ${rel_mip_tables_dir} 
cd ${rel_mip_tables_dir} 
git checkout --track origin/01.00.31
Submodule 'configuration/cmip6/cmip6-cmor-tables' (https://github.com/PCMDI/cmip6-cmor-tables.git) registered for path 'configuration/cmip6/cmip6-cmor-tables'
Cloning into 'configuration/cmip6/cmip6-cmor-tables'...
Submodule path 'configuration/cmip6/cmip6-cmor-tables': checked out '9f0ed59b7575331c0c25320cfa8bb7f0b722a2d6'
Previous HEAD position was 9f0ed59... Merge pull request #255 from PCMDI/cmor_3.5.0
Switched to a new branch '01.00.31'
Branch 01.00.31 set up to track remote branch 01.00.31 from origin.

We can parse the tables with the json package. The Amon MIP-Table contains a Header and a variable_entry section.

mip_tables_dir=basedir+"/configuration/cmip6/cmip6-cmor-tables/"
import json
with open(mip_tables_dir+"/Tables/CMIP6_Amon.json") as f:
    amon=json.load(f)
print(amon.keys())
print(amon["Header"].keys())
print(amon["variable_entry"].keys())
dict_keys(['Header', 'variable_entry'])
dict_keys(['data_specs_version', 'cmor_version', 'table_id', 'realm', 'table_date', 'missing_value', 'int_missing_value', 'product', 'approx_interval', 'generic_levels', 'mip_era', 'Conventions'])
dict_keys(['ccb', 'cct', 'cfc113global', 'cfc11global', 'cfc12global', 'ch4', 'ch4Clim', 'ch4global', 'ch4globalClim', 'ci', 'cl', 'cli', 'clivi', 'clt', 'clw', 'clwvi', 'co2', 'co2Clim', 'co2mass', 'co2massClim', 'evspsbl', 'fco2antt', 'fco2fos', 'fco2nat', 'hcfc22global', 'hfls', 'hfss', 'hur', 'hurs', 'hus', 'huss', 'mc', 'n2o', 'n2oClim', 'n2oglobal', 'n2oglobalClim', 'o3', 'o3Clim', 'pfull', 'phalf', 'pr', 'prc', 'prsn', 'prw', 'ps', 'psl', 'rlds', 'rldscs', 'rlus', 'rlut', 'rlutcs', 'rsds', 'rsdscs', 'rsdt', 'rsus', 'rsuscs', 'rsut', 'rsutcs', 'rtmt', 'sbl', 'sci', 'sfcWind', 'ta', 'tas', 'tasmax', 'tasmin', 'tauu', 'tauv', 'ts', 'ua', 'uas', 'va', 'vas', 'wap', 'zg'])
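
Once a table is loaded, its variable entries can be filtered like any dictionary. The sketch below selects variables by one of their dimensions. The small table_excerpt dict is an abbreviated stand-in written for this example (its values are assumptions that mirror the layout of variable_entry), so that the snippet runs without the cloned tables. With the real table, amon["variable_entry"] would be passed instead.

```python
# Abbreviated stand-in for amon["variable_entry"]; values are assumptions
table_excerpt = {
    "tas": {"units": "K", "dimensions": "longitude latitude time height2m"},
    "ta": {"units": "K", "dimensions": "longitude latitude plev19 time"},
    "pr": {"units": "kg m-2 s-1", "dimensions": "longitude latitude time"},
}


def variables_with_dimension(variable_entries, dimension):
    """Return the sorted CMOR names whose 'dimensions' include `dimension`."""
    return sorted(
        name
        for name, entry in variable_entries.items()
        if dimension in entry.get("dimensions", "").split()
    )


print(variables_with_dimension(table_excerpt, "plev19"))
print(variables_with_dimension(table_excerpt, "time"))
```
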
#%%capture --no-stdout
#Standardize all variables of example_interface.nc for CMIP6_Amon.json:
infotabledir="cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/"
cdocmorinfos=[infotabledir+k 
              for k in ["dkrz_atts",
                        "historical_atts",
                        "mpi-esm1-2-lr_atts",
                        "cdocmorcontrol_atts",
                        "member_atts",
                        "nominalresolution_atts"]
             ]
cdocmorinfostring=','.join(cdocmorinfos)
cdo.cmor(mip_tables_dir+'/Tables/CMIP6_Amon.json,'
         'i='+cdocmorinfostring,
         input=workdir+'/example_interface.nc',
         options="-v")
# DEBUG - start =============================================================
CALL  :/envs/bin/cdo -O -s -v -cmor,/builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/configuration/cmip6/cmip6-cmor-tables//Tables/CMIP6_Amon.json,i=cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/dkrz_atts,cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/historical_atts,cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/mpi-esm1-2-lr_atts,cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/cdocmorcontrol_atts,cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/member_atts,cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/nominalresolution_atts /builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson//example_interface.nc
STDOUT:
STDERR:
 OpenMP:  num_procs=8  max_threads=1

cdo    cmor (Abort): CMOR support not compiled in!

# DEBUG - end ===============================================================
RETURNCODE:1
Error in calling operator cmor with:
>>> /envs/bin/cdo -O -s -v -cmor,/builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/configuration/cmip6/cmip6-cmor-tables//Tables/CMIP6_Amon.json,i=cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/dkrz_atts,cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/historical_atts,cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/mpi-esm1-2-lr_atts,cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/cdocmorcontrol_atts,cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/member_atts,cdo-incl-cmor/configuration/cmip6/cmip6-cdocmorinfo/nominalresolution_atts /builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson//example_interface.nc<<<
STDOUT:
STDERR: OpenMP:  num_procs=8  max_threads=1

cdo    cmor (Abort): CMOR support not compiled in!
---------------------------------------------------------------------------
CDOException                              Traceback (most recent call last)
Cell In[10], line 13
      4 cdocmorinfos=[infotabledir+k 
      5               for k in ["dkrz_atts",
      6                         "historical_atts",
   (...)
     10                         "nominalresolution_atts"]
     11              ]
     12 cdocmorinfostring=','.join(cdocmorinfos)
---> 13 cdo.cmor(mip_tables_dir+'/Tables/CMIP6_Amon.json,'
     14          'i='+cdocmorinfostring,
     15          input=workdir+'/example_interface.nc',
     16          options="-v")

File /envs/lib/python3.11/site-packages/cdo/cdo.py:505, in Cdo.__call__(self, *args, **kwargs)
    503             return None
    504         else:
--> 505             raise CDOException(**retvals)
    506 else:
    507     if kwargs["force"] or \
    508        (kwargs.__contains__("output") and not os.path.isfile(kwargs["output"])):

CDOException: (returncode:1)  OpenMP:  num_procs=8  max_threads=1

cdo    cmor (Abort): CMOR support not compiled in!

CMOR variable#

The entry of a variable inside a MIP-Table is called the cmor name of the variable.

A CMOR variable is the unique combination of a cmor name and the MIP-Table that contains it.

The data standard for the same variable can differ from one MIP-Table to another. The differences can affect any variable information, from cell methods to the grid.

  • In CMIP6, the request for monthly air temperature (CMIP6_Amon.json) differs from the request for daily air temperature (CMIP6_day.json).
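
A minimal sketch of this idea, assuming the layout of the CMIP6 MIP-table JSON files (a top-level `variable_entry` object keyed by cmor name). The two dictionaries are hand-written excerpts for illustration, not the full tables, and the helper names `cmor_variable` and `differing_fields` are hypothetical:

```python
# Hand-written excerpts that mimic the layout of CMIP6_Amon.json and
# CMIP6_day.json. The real tables contain many more entries and fields.
amon_table = {
    "variable_entry": {
        "tas": {"frequency": "mon",
                "cell_methods": "area: time: mean",
                "dimensions": "longitude latitude time height2m"}
    }
}
day_table = {
    "variable_entry": {
        "tas": {"frequency": "day",
                "cell_methods": "area: time: mean",
                "dimensions": "longitude latitude time height2m"}
    }
}


def cmor_variable(table_id, cmor_name):
    """Identify a CMOR variable by the pair (MIP-table, cmor name)."""
    return (table_id, cmor_name)


def differing_fields(entry_a, entry_b):
    """Return the fields whose requested values differ between two entries."""
    keys = sorted(set(entry_a) | set(entry_b))
    return {key: (entry_a.get(key), entry_b.get(key))
            for key in keys if entry_a.get(key) != entry_b.get(key)}


tas_amon = amon_table["variable_entry"]["tas"]
tas_day = day_table["variable_entry"]["tas"]

# Same cmor name, different table: two distinct CMOR variables.
distinct = cmor_variable("Amon", "tas") != cmor_variable("day", "tas")
diff = differing_fields(tas_amon, tas_day)
print(distinct, diff)
```

In this excerpt only the frequency differs; in the full tables other fields can differ as well.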

4. cdocmorinfo#

Global attributes and operator control keywords are specified in a cdocmorinfo file.

Global attributes and the CV:#

Project dependence of attribute nomenclature:

Attribute \ Project | CMIP6              | CMIP5
MIP                 | activity_id        | project_id
Model               | source_id          | model_id
Institute           | institution_id     | institute_id
Ensemble member     | variant_label      | member
Grid resolution     | nominal_resolution |
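
The nomenclature table can be used as a lookup when attributes written for one project have to be renamed for another. The following is a sketch only: the function `rename_attributes` is hypothetical, and the attribute values in the usage lines are placeholders. Only the names are translated; the values still have to be valid in the target project's CV.

```python
# Attribute names per project, taken from the table above.
# None marks the cell that is empty in the table.
ATTRIBUTE_NAMES = {
    "MIP": {"CMIP6": "activity_id", "CMIP5": "project_id"},
    "Model": {"CMIP6": "source_id", "CMIP5": "model_id"},
    "Institute": {"CMIP6": "institution_id", "CMIP5": "institute_id"},
    "Ensemble member": {"CMIP6": "variant_label", "CMIP5": "member"},
    "Grid resolution": {"CMIP6": "nominal_resolution", "CMIP5": None},
}


def rename_attributes(attrs, source="CMIP5", target="CMIP6"):
    """Rename known global attributes from one nomenclature to the other."""
    renames = {names[source]: names[target]
               for names in ATTRIBUTE_NAMES.values()
               if names[source] and names[target]}
    # Attributes without a known counterpart are passed through unchanged.
    return {renames.get(key, key): value for key, value in attrs.items()}


# Placeholder values, for illustration only.
cmip5_attrs = {"model_id": "MY-MODEL", "institute_id": "MY-INSTITUTE",
               "title": "example"}
cmip6_style = rename_attributes(cmip5_attrs)
print(cmip6_style)
```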

Experiments are registered in the CV with attached predefined attributes:

Attributes \ experiment_id | 1pctCO2                            | amip      | ssp585
activity_id                | CMIP                               | CMIP      | ScenarioMIP
experiment                 | 1 percent per year increase in CO2 | AMIP      | update of RCP8.5 based on SSP5
sub_experiment_id          | none                               | none      | none
parent_activity_id         | CMIP                               | no parent | CMIP
parent_experiment_id       | piControl                          | no parent | historical
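
Because these attributes are fixed by the CV, they can be looked up from the experiment_id and used to spot contradictions in user-supplied attributes. The sketch below copies the three experiments from the table into a dictionary; it does not read the actual CV file, and the function names are hypothetical:

```python
# Predefined attributes per experiment_id, copied from the table above.
EXPERIMENT_CV = {
    "1pctCO2": {"activity_id": "CMIP",
                "experiment": "1 percent per year increase in CO2",
                "sub_experiment_id": "none",
                "parent_activity_id": "CMIP",
                "parent_experiment_id": "piControl"},
    "amip": {"activity_id": "CMIP",
             "experiment": "AMIP",
             "sub_experiment_id": "none",
             "parent_activity_id": "no parent",
             "parent_experiment_id": "no parent"},
    "ssp585": {"activity_id": "ScenarioMIP",
               "experiment": "update of RCP8.5 based on SSP5",
               "sub_experiment_id": "none",
               "parent_activity_id": "CMIP",
               "parent_experiment_id": "historical"},
}


def experiment_attributes(experiment_id):
    """Return the attributes registered for an experiment in this excerpt."""
    if experiment_id not in EXPERIMENT_CV:
        raise ValueError(f"{experiment_id!r} is not in this CV excerpt")
    return {"experiment_id": experiment_id, **EXPERIMENT_CV[experiment_id]}


def conflicts_with_cv(attrs):
    """List the attributes that contradict the registered values."""
    registered = experiment_attributes(attrs["experiment_id"])
    return sorted(key for key, value in attrs.items()
                  if key in registered and registered[key] != value)


user_attrs = {"experiment_id": "ssp585",
              "activity_id": "CMIP",  # contradicts the CV entry
              "parent_experiment_id": "historical"}
conflicts = conflicts_with_cv(user_attrs)
print(conflicts)
```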
%%capture --no-stdout
#Since there is
#1. a default for i, which is '.cdocmorinfo', and
#2. the attribute MIP_table_dir specified in cdocmorinfo,
#we only need to concatenate the cdocmorinfo files into .cdocmorinfo in our pwd:
#!rm {pwd}/.cdocmorinfo
for c in cdocmorinfos:
    !cat {c} >>{pwd}/.cdocmorinfo
#so that it is sufficient to call:
cdo.cmor(mip_tables_dir+'/Tables/CMIP6_Amon.json',input=workdir+'example_interface.nc')
# DEBUG - start =============================================================
CALL  :/envs/bin/cdo -O -s -cmor,/builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/configuration/cmip6/cmip6-cmor-tables//Tables/CMIP6_Amon.json /builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/example_interface.nc
STDOUT:
STDERR:

cdo    cmor (Abort): CMOR support not compiled in!

# DEBUG - end ===============================================================
RETURNCODE:1
Error in calling operator cmor with:
>>> /envs/bin/cdo -O -s -cmor,/builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/configuration/cmip6/cmip6-cmor-tables//Tables/CMIP6_Amon.json /builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/example_interface.nc<<<
STDOUT:
STDERR:
cdo    cmor (Abort): CMOR support not compiled in!
---------------------------------------------------------------------------
CDOException                              Traceback (most recent call last)
Cell In[11], line 9
      7     get_ipython().system('cat {c} >>{pwd}/.cdocmorinfo')
      8 #so that it is sufficient to call:
----> 9 cdo.cmor(mip_tables_dir+'/Tables/CMIP6_Amon.json',input=workdir+'example_interface.nc')

File /envs/lib/python3.11/site-packages/cdo/cdo.py:505, in Cdo.__call__(self, *args, **kwargs)
    503             return None
    504         else:
--> 505             raise CDOException(**retvals)
    506 else:
    507     if kwargs["force"] or \
    508        (kwargs.__contains__("output") and not os.path.isfile(kwargs["output"])):

CDOException: (returncode:1) 
cdo    cmor (Abort): CMOR support not compiled in!

5. Select subset of variables#

%%capture --no-stdout
#Only process variable with cmor_name=tas
cdo.cmor('Amon,cmor_name=tas',input=workdir+'example_interface.nc')
#Same process, but with short keyword cn:
#cdo.cmor('Amon,cn=tas',      input=workdir+'examples/example_interface.nc')
# DEBUG - start =============================================================
CALL  :/envs/bin/cdo -O -s -cmor,Amon,cmor_name=tas /builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/example_interface.nc
STDOUT:
STDERR:

cdo    cmor (Abort): CMOR support not compiled in!

# DEBUG - end ===============================================================
RETURNCODE:1
Error in calling operator cmor with:
>>> /envs/bin/cdo -O -s -cmor,Amon,cmor_name=tas /builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/example_interface.nc<<<
STDOUT:
STDERR:
cdo    cmor (Abort): CMOR support not compiled in!
---------------------------------------------------------------------------
CDOException                              Traceback (most recent call last)
Cell In[12], line 2
      1 #Only process variable with cmor_name=tas
----> 2 cdo.cmor('Amon,cmor_name=tas',input=workdir+'example_interface.nc')
      3 #Same process, but with short keyword cn:
      4 #cdo.cmor('Amon,cn=tas',      input=workdir+'examples/example_interface.nc')

File /envs/lib/python3.11/site-packages/cdo/cdo.py:505, in Cdo.__call__(self, *args, **kwargs)
    503             return None
    504         else:
--> 505             raise CDOException(**retvals)
    506 else:
    507     if kwargs["force"] or \
    508        (kwargs.__contains__("output") and not os.path.isfile(kwargs["output"])):

CDOException: (returncode:1) 
cdo    cmor (Abort): CMOR support not compiled in!

Variable mapping#

How to map variables?

  1. Know the CMOR variable you aim to produce

  2. Link to the matching infile variable(s)

    • specify a recipe

  3. Provide attributes

Keyword          | Short name | Value format                         | Default
cmor_name        | cn         | Variable name included in MIP-table  |
name             | n          | Input variable name                  |
code             | c          | Three digits integer. GRIB code.     |
units            | u          | String. Must be readable by udunits. |
cell_methods     | cm         | Character (see below)                | m
positive         | p          | u=upward, d=downward                 |
variable_comment | vc         | String                               |
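
The keywords are passed to the operator as one comma-separated argument string that starts with the MIP-table. As a sketch, the helper below (a hypothetical name, not part of python-cdo) assembles such a string from long keyword names and uses the short names from the table. It only builds the string and does not call cdo:

```python
# Long keyword -> short name, as listed in the table above.
SHORT_NAMES = {
    "cmor_name": "cn",
    "name": "n",
    "code": "c",
    "units": "u",
    "cell_methods": "cm",
    "positive": "p",
    "variable_comment": "vc",
}


def mapping_arguments(table, **keywords):
    """Build the comma-separated argument string for the cmor operator."""
    unknown = sorted(set(keywords) - set(SHORT_NAMES))
    if unknown:
        raise ValueError(f"unknown mapping keywords: {unknown}")
    parts = [table]
    parts += [f"{SHORT_NAMES[key]}={value}" for key, value in keywords.items()]
    return ",".join(parts)


args = mapping_arguments("Amon", cmor_name="tas", code=167,
                         units="K", cell_methods="m")
print(args)  # Amon,cn=tas,c=167,u=K,cm=m
```

The resulting string could then be passed as the first argument of `cdo.cmor(...)`. Values that contain commas or spaces, such as a longer variable_comment, would need extra care and are not handled here.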
%%capture --no-stdout
#Map the variable with code=167 to the CMOR variable tas.
# All mapping information describes the infile variable.
cdo.cmor('Amon,cn=tas,'
         'code=167,units=K,cell_methods=m',
         input=workdir+'example_mapping.grb')
# DEBUG - start =============================================================
CALL  :/envs/bin/cdo -O -s -cmor,Amon,cn=tas,code=167,units=K,cell_methods=m /builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/example_mapping.grb
STDOUT:
STDERR:

cdo    cmor (Abort): CMOR support not compiled in!

# DEBUG - end ===============================================================
RETURNCODE:1
Error in calling operator cmor with:
>>> /envs/bin/cdo -O -s -cmor,Amon,cn=tas,code=167,units=K,cell_methods=m /builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/example_mapping.grb<<<
STDOUT:
STDERR:
cdo    cmor (Abort): CMOR support not compiled in!
---------------------------------------------------------------------------
CDOException                              Traceback (most recent call last)
Cell In[13], line 3
      1 #Map the variable with code=167 to the CMOR variable tas.
      2 # All mapping information describes the infile variable.
----> 3 cdo.cmor('Amon,cn=tas,'
      4          'code=167,units=K,cell_methods=m',
      5          input=workdir+'example_mapping.grb')

File /envs/lib/python3.11/site-packages/cdo/cdo.py:505, in Cdo.__call__(self, *args, **kwargs)
    503             return None
    504         else:
--> 505             raise CDOException(**retvals)
    506 else:
    507     if kwargs["force"] or \
    508        (kwargs.__contains__("output") and not os.path.isfile(kwargs["output"])):

CDOException: (returncode:1) 
cdo    cmor (Abort): CMOR support not compiled in!
%%capture --no-stdout
#Write mapping information to mapping table:
with open(workdir+'mapping_table.txt', 'a') as mapping_table:
    mapping_table.write('&parameter cmor_name=tas code=167 units=K cell_methods=m /\n')
mapping_table.close()
#Select a specific variable in the command line to be mapped with mapping_table.txt:
cdo.cmor('Amon,cn=tas',
         'mapping_table='+workdir+'mapping_table.txt',
          input=workdir+'example_mapping.grb')
# DEBUG - start =============================================================
CALL  :/envs/bin/cdo -O -s -cmor,Amon,cn=tas,mapping_table=/builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/mapping_table.txt /builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/example_mapping.grb
STDOUT:
STDERR:

cdo    cmor (Abort): CMOR support not compiled in!

# DEBUG - end ===============================================================
RETURNCODE:1
Error in calling operator cmor with:
>>> /envs/bin/cdo -O -s -cmor,Amon,cn=tas,mapping_table=/builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/mapping_table.txt /builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/example_mapping.grb<<<
STDOUT:
STDERR:
cdo    cmor (Abort): CMOR support not compiled in!
---------------------------------------------------------------------------
CDOException                              Traceback (most recent call last)
Cell In[14], line 6
      4 mapping_table.close()
      5 #Select a specific variable in the command line to be mapped with mapping_table.txt:
----> 6 cdo.cmor('Amon,cn=tas',
      7          'mapping_table='+workdir+'mapping_table.txt',
      8           input=workdir+'example_mapping.grb')

File /envs/lib/python3.11/site-packages/cdo/cdo.py:505, in Cdo.__call__(self, *args, **kwargs)
    503             return None
    504         else:
--> 505             raise CDOException(**retvals)
    506 else:
    507     if kwargs["force"] or \
    508        (kwargs.__contains__("output") and not os.path.isfile(kwargs["output"])):

CDOException: (returncode:1) 
cdo    cmor (Abort): CMOR support not compiled in!
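
The mapping-table cell above opens the file in append mode, so every re-run adds the same `&parameter` line again. A minimal sketch of one way to avoid duplicate entries is shown below. The helpers `mapping_line` and `add_mapping` are hypothetical, and the example writes to a temporary directory instead of workdir:

```python
import tempfile
from pathlib import Path


def mapping_line(**keywords):
    """Format one mapping-table entry in the '&parameter ... /' form."""
    body = " ".join(f"{key}={value}" for key, value in keywords.items())
    return f"&parameter {body} /"


def add_mapping(path, **keywords):
    """Append a mapping entry unless an identical line is already present."""
    path = Path(path)
    line = mapping_line(**keywords)
    existing = path.read_text().splitlines() if path.exists() else []
    if line not in existing:
        with path.open("a") as table:
            table.write(line + "\n")
    return line


with tempfile.TemporaryDirectory() as tmp:
    table_path = Path(tmp) / "mapping_table.txt"
    # Calling the helper twice simulates re-running the notebook cell.
    for _ in range(2):
        add_mapping(table_path, cmor_name="tas", code=167,
                    units="K", cell_methods="m")
    lines = table_path.read_text().splitlines()

print(lines)
```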
%%capture --no-stdout
#Process and map all variables which are in example_collect.grb with mtPERFECT.txt:
cdo.cmor('Amon',
         'mt='+workdir+'mtPERFECT.txt',
          input=workdir+'example_collect.grb')
# DEBUG - start =============================================================
CALL  :/envs/bin/cdo -O -s -cmor,Amon,mt=/builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/mtPERFECT.txt /builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/example_collect.grb
STDOUT:
STDERR:

cdo    cmor (Abort): CMOR support not compiled in!

# DEBUG - end ===============================================================
RETURNCODE:1
Error in calling operator cmor with:
>>> /envs/bin/cdo -O -s -cmor,Amon,mt=/builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/mtPERFECT.txt /builds/data-infrastructure-services/tutorials-and-use-cases/docs/source/cdo-incl-cmor/application/handson/example_collect.grb<<<
STDOUT:
STDERR:
cdo    cmor (Abort): CMOR support not compiled in!
---------------------------------------------------------------------------
CDOException                              Traceback (most recent call last)
Cell In[15], line 2
      1 #Process and map all variables which are in example_collect.grb with mtPERFECT.txt:
----> 2 cdo.cmor('Amon',
      3          'mt='+workdir+'mtPERFECT.txt',
      4           input=workdir+'example_collect.grb')

File /envs/lib/python3.11/site-packages/cdo/cdo.py:505, in Cdo.__call__(self, *args, **kwargs)
    503             return None
    504         else:
--> 505             raise CDOException(**retvals)
    506 else:
    507     if kwargs["force"] or \
    508        (kwargs.__contains__("output") and not os.path.isfile(kwargs["output"])):

CDOException: (returncode:1) 
cdo    cmor (Abort): CMOR support not compiled in!

7. Coordinates#

%%capture --no-stdout
#Define the value of the z_axis height2m as 1.5 m:
with open(workdir+'.cdocmorinfo.txt', 'a') as info:
    info.write('height2m=1.5\n')
#Each option is passed as its own, properly closed string:
cdo.cmor('Amon,cn=tas',
         'z_axis=height2m',
         'mapping_table=mapping_table.txt',
          input=workdir+'example_T_3M.nc')