import ocean_model_skill_assessor as omsa
from IPython.display import Code, Image

CLI demo of ocean-model-skill-assessor with known data files

This demo runs command line interface (CLI) commands only, which is accomplished in a Jupyter notebook by prefacing commands with !. To transfer these commands to a terminal window, remove the ! but otherwise keep commands the same.

More detailed docs about running with the CLI are available.

Validating model output against data for one variable takes three steps:

  1. Make a catalog for your model output.

  2. Make a catalog for your data.

  3. Run the comparison.

These steps save files, along with a log, into a user application cache directory. You can check the location of a project directory on the command line with omsa proj_path --project_name PROJECT_NAME.

Make model catalog

Set up a catalog file for your model output. The user can input necessary keyword arguments – through kwargs_open – so that xarray will be able to read in the model output. Generally it is good to use skip_entry_metadata when using the make_catalog command for model output since we are using only one model and the entry metadata is aimed at being able to compare datasets.

In the following command,

  • make_catalog is the function being run from OMSA

  • demo_local is the name of the project which will be used as the subdirectory name

  • local is the type of catalog to choose when making a catalog for the model output regardless of where the model output is stored

  • “model” is the catalog name which will be used for the file name and in the catalog itself

  • Specific kwargs to be input to the catalog command are

    • filenames which is a string describing where the model output can be found. If the model output is available through a sequence of filenames instead of a single server address, represent them with a single glob-style statement, for example, “/filepath/filenameprefix_*.nc”.

    • skip_entry_metadata which should be used when running make_catalog for model output

  • kwargs_open which holds all keywords required for xr.open_dataset or xr.open_mfdataset to successfully read your model output.
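As a rough illustration of what a glob-style filenames value selects, the sketch below applies the example pattern to a handful of file names using Python's standard fnmatch module. The candidate file names are hypothetical, and this only demonstrates glob semantics; it is not how OMSA itself resolves the pattern.

```python
from fnmatch import fnmatchcase

# Pattern taken from the example above.
pattern = "/filepath/filenameprefix_*.nc"

# Hypothetical file names, used only to illustrate the matching.
candidates = [
    "/filepath/filenameprefix_0001.nc",
    "/filepath/filenameprefix_0002.nc",
    "/filepath/otherprefix_0001.nc",
    "/filepath/filenameprefix_0001.txt",
]

# Keep only the names that the glob-style pattern would select.
matched = [name for name in candidates if fnmatchcase(name, pattern)]
print(matched)
```

Only the two names that share the prefix and end in .nc are selected, so a single pattern can stand in for a whole sequence of model output files.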

# get local path for model output sample file from xroms
import xroms
url = xroms.datasets.CLOVER.fetch("ROMS_example_full_grid.nc")
!omsa make_catalog --project_name demo_local --catalog_type local --catalog_name model --kwargs filenames=$url skip_entry_metadata=True
[2024-10-30 14:52:04,644] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/stable/ocean_model_skill_assessor/main.py:432}
INFO - Catalog saved to /home/docs/.cache/ocean-model-skill-assessor/demo_local/model.yaml with 1 entries.


Make data catalog

Set up a catalog of the datasets with which you want to compare your model output. In this example, we use only known data file locations to create our catalog.

In this step, we use the same project_name as in the previous step so that the resulting catalog file lands in the same subdirectory. We create a catalog of type “local” since we have known data locations, and we name this catalog file “local”. We input the filenames as a list in quotes (this specific syntax is necessary for passing a list through the command line interface), along with any keyword arguments necessary for reading the datasets.

In the following command:

  • make_catalog is the function being run from OMSA

  • demo_local is the name of the project which will be used as the subdirectory name

  • local is the type of catalog to choose when making a catalog for the known data files

  • “local” is the catalog name which will be used for the file name and in the catalog itself

  • Specific kwargs to be input to the catalog command are

    • filenames which is a string or a list of strings pointing to where the data files can be found. If you are using a list, the syntax for the command line interface is filenames="[file1,file2]".

  • kwargs_open all keywords required for xr.open_dataset, xr.open_mfdataset, pandas.read_csv, or whatever method will ultimately be used to successfully read your datasets. These must be applicable to all datasets represented by filenames. If they are not, run this command multiple times, once for each set of filenames and kwargs_open that match.
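If the file locations are already held in a Python list, the bracketed list syntax described above can be produced with a small helper. This is a minimal sketch: format_filenames is a hypothetical name and not part of OMSA, and file1 and file2 are the placeholder names from the text.

```python
def format_filenames(filenames):
    """Return the value to place after filenames= on the command line.

    A single string is passed through unchanged; a list is joined with
    commas and wrapped in square brackets, as in filenames="[file1,file2]".
    """
    if isinstance(filenames, str):
        return filenames
    return "[" + ",".join(filenames) + "]"


value = format_filenames(["file1", "file2"])
# The surrounding double quotes keep the shell from splitting the list.
print(f'filenames="{value}"')
```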

!omsa make_catalog --project_name demo_local --catalog_type local --catalog_name local --kwargs filenames="[https://erddap.sensors.axds.co/erddap/tabledap/gov_ornl_cdiac_coastalms_88w_30n.csvp?time%2Clatitude%2Clongitude%2Cz%2Csea_water_temperature&time%3E=2009-11-19T012%3A00%3A00Z&time%3C=2009-11-19T16%3A00%3A00Z]" --metadata featuretype=timeSeries maptype=point
Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/bin/omsa", line 8, in <module>
    sys.exit(main())
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/stable/ocean_model_skill_assessor/CLI.py", line 161, in main
    omsa.make_catalog(
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/stable/ocean_model_skill_assessor/main.py", line 372, in make_catalog
    cat = make_local_catalog(
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/stable/ocean_model_skill_assessor/main.py", line 162, in make_local_catalog
    entries = {
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/stable/ocean_model_skill_assessor/main.py", line 163, in <dictcomp>
    PurePath(source.urlpath).stem: LocalCatalogEntry(
AttributeError: 'CSVSource' object has no attribute 'urlpath'

Run comparison

Now that the model output and dataset catalogs are prepared, we can run the comparison of the two.

In this step, we use the same project_name as the other steps so as to keep all files in the same subdirectory. We input the data catalog name under catalog_names and the model catalog name under model_name.

At this point we need to select a single variable to compare between the model and datasets, and this requires a little extra input. Because we don’t know anything about the format of any given input data file, variables will be interpreted with some flexibility in the form of a set of regular expressions. In the present case, we will compare the water temperature between the model and the datasets (the model output and datasets selected for our catalogs should contain the variable we want to compare). Several sets of regular expressions, called “vocabularies”, are available with the package to be used for this purpose, and in this case we will use one called “general” which should match many commonly-used variable names. “general” is selected under vocab_names, and the particular key from the general vocabulary that we are comparing is selected with key.

See the vocabulary here.

import cf_pandas as cfp

paths = omsa.paths.Paths()
vocab = cfp.Vocab(paths.VOCAB_PATH("general"))
vocab
{'temp': {'name': '(?i)^(?!.*(air|qc|status|atmospheric|bottom|dew)).*(temp|sst).*'}, 'salt': {'name': '(?i)^(?!.*(soil|qc|status|bottom)).*(sal|sss).*'}, 'ssh': {'name': '(?i)^(?!.*(qc|status)).*(sea_surface_height|surface_elevation|zeta).*'}, 'u': {'name': 'u$|(?i)(?=.*east)(?=.*vel)'}, 'v': {'name': 'v$|(?i)(?=.*north)(?=.*vel)'}, 'w': {'name': 'w$|(?i)(?=.*up)(?=.*vel)'}, 'water_dir': {'name': '(?i)^(?!.*(qc|status|air|wind))(?=.*dir)(?=.*water)'}, 'water_speed': {'name': '(?i)^(?!.*(qc|status|air|wind))(?=.*speed)(?=.*water)'}, 'wind_dir': {'name': '(?i)^(?!.*(qc|status|water))(?=.*dir)(?=.*wind)'}, 'wind_speed': {'name': '(?i)^(?!.*(qc|status|water))(?=.*speed)(?=.*wind)'}, 'sea_ice_u': {'name': '(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*u)|(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*x)(?=.*vel)|(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*east)(?=.*vel)'}, 'sea_ice_v': {'name': '(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*v)|(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*y)(?=.*vel)|(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*north)(?=.*vel)'}, 'sea_ice_area_fraction': {'name': '(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*area)(?=.*fraction)'}}
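To get a feel for how a vocabulary entry selects variables, the sketch below applies the “temp” regular expression from the output above to a few variable names using Python's re module. The candidate names are illustrative assumptions, and the matching inside OMSA may go through other helpers; this only shows what the expression itself accepts and rejects.

```python
import re

# The "temp" entry copied from the "general" vocabulary shown above.
temp_pattern = r"(?i)^(?!.*(air|qc|status|atmospheric|bottom|dew)).*(temp|sst).*"

# Illustrative variable names (assumptions, not taken from a specific dataset).
names = [
    "sea_water_temperature",
    "SST",
    "air_temperature",
    "sea_water_temperature_qc_agg",
    "salt",
]

# True where the name would be interpreted as water temperature.
matches = {name: bool(re.match(temp_pattern, name)) for name in names}
for name, is_match in matches.items():
    print(f"{name}: {is_match}")
```

The negative lookahead at the start of the expression is what excludes air temperature and quality-control columns even though their names contain “temp”.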

In the following command:

  • run is the function being run from OMSA

  • demo_local is the name of the project which will be used as the subdirectory name

  • catalog_names are the names of any catalogs with datasets to include in the comparison. In this case we have just one called “local”

  • model_name is the name of the model catalog we previously created

  • vocab_names are the names of the vocabularies to use for interpreting which variable to compare from the model output and datasets. If multiple are input, they are combined together. The variable nicknames need to match in the vocabularies to be interpreted together.

  • key is the nickname or alias of the variable as given in the input vocabulary
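When the comparison is scripted rather than typed, the command can be assembled from Python values. The following is a minimal sketch: build_run_command is a hypothetical helper, not part of OMSA, and passing several catalog or vocabulary names as space-separated values is an assumption. The flag names and values are the ones used in this demo.

```python
import shlex


def build_run_command(project_name, catalog_names, model_name, vocab_names,
                      key, kwargs_map=None, more_kwargs=None):
    """Assemble the argument list for the run command (hypothetical helper)."""
    args = [
        "omsa", "run",
        "--project_name", project_name,
        "--catalog_names", *catalog_names,
        "--model_name", model_name,
        "--vocab_names", *vocab_names,
        "--key", key,
    ]
    # Optional keyword groups are written as key=value pairs after their flag.
    if kwargs_map:
        args += ["--kwargs_map", *[f"{k}={v}" for k, v in kwargs_map.items()]]
    if more_kwargs:
        args += ["--more_kwargs", *[f"{k}={v}" for k, v in more_kwargs.items()]]
    return args


args = build_run_command(
    project_name="demo_local",
    catalog_names=["local"],
    model_name="model",
    vocab_names=["general"],
    key="temp",
    kwargs_map={"label_with_station_name": True},
    more_kwargs={
        "interpolate_horizontal": False,
        "check_in_boundary": False,
        "plot_map": True,
        "dd": 5,
        "alpha": 20,
    },
)
command = shlex.join(args)
print(command)
```

The resulting list can be handed to subprocess.run, or the joined string can be pasted into a terminal.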

!omsa run --project_name demo_local --catalog_names local --model_name model --vocab_names general \
    --key temp \
    --kwargs_map label_with_station_name=True \
    --more_kwargs interpolate_horizontal=False check_in_boundary=False plot_map=True dd=5 alpha=20
[2024-10-30 14:52:23,781] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/stable/ocean_model_skill_assessor/main.py:1996}
INFO - Input parameters: {'catalogs': ['local'], 'project_name': 'demo_local', 'key_variable': 'temp', 'model_name': 'model', 'vocabs': ['general'], 'vocab_labels': None, 'ndatasets': None, 'kwargs_map': {'label_with_station_name': True}, 'verbose': True, 'mode': 'w', 'testing': False, 'alpha': 20, 'dd': 5, 'preprocess': False, 'need_xgcm_grid': False, 'xcmocean_options': None, 'kwargs_xroms': None, 'locstream': True, 'interpolate_horizontal': False, 'horizontal_interp_code': 'delaunay', 'save_horizontal_interp_weights': True, 'want_vertical_interp': False, 'want_locstreamZ': False, 'extrap': False, 'model_source_name': None, 'override_chunks': None, 'catalog_source_names': None, 'user_min_time': None, 'user_max_time': None, 'check_in_boundary': False, 'tidal_filtering': None, 'ts_mods': None, 'model_only': False, 'plot_map': True, 'no_Z': False, 'skip_mask': False, 'override_mask_lon': None, 'known_model_depth_attr_positive': None, 'wetdry': False, 'plot_count_title': True, 'cache_dir': None, 'return_fig': False, 'override_model': False, 'override_processed': False, 'override_stats': False, 'override_plot': False, 'plot_description': None, 'kwargs_plot': None, 'skip_key_variable_check': False, 'kwargs': {}, 'paths': <ocean_model_skill_assessor.paths.Paths object at 0x7fc9b7c83f70>, 'logger': <Logger ocean_model_skill_assessor.utils (INFO)>}

Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/bin/omsa", line 8, in <module>
    sys.exit(main())
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/stable/ocean_model_skill_assessor/CLI.py", line 210, in main
    omsa.main.run(
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/stable/ocean_model_skill_assessor/main.py", line 2023, in run
    cats = open_catalogs(catalogs, paths, skip_strings=["_base", "_all", "_tidecons"])
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/stable/ocean_model_skill_assessor/utils.py", line 367, in open_catalogs
    cat = intake.open_catalog(paths.CAT_PATH(catalog))
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/intake/__init__.py", line 186, in open_catalog
    return registry[driver](uri, **kwargs)
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/intake/catalog/local.py", line 617, in __init__
    super(YAMLFileCatalog, self).__init__(**kwargs)
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/intake/catalog/base.py", line 128, in __init__
    self.force_reload()
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/intake/catalog/base.py", line 186, in force_reload
    self._load()
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/intake/catalog/local.py", line 647, in _load
    with file_open as f:
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/fsspec/core.py", line 105, in __enter__
    f = self.fs.open(self.path, mode=mode)
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/fsspec/spec.py", line 1301, in open
    f = self._open(
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/fsspec/implementations/local.py", line 195, in _open
    return LocalFileOpener(path, mode, fs=self, **kwargs)
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/fsspec/implementations/local.py", line 359, in __init__
    self._open()
  File "/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/fsspec/implementations/local.py", line 364, in _open
    self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/docs/.cache/ocean-model-skill-assessor/demo_local/local.yaml'

Look at results

Now we can look at the results from our comparison! You can find the location of the resultant files printed at the end of the run command output above. Or you can find the path to the project directory while in Python with:

paths = omsa.paths.Paths("demo_local")
paths.PROJ_DIR
PosixPath('/home/docs/.cache/ocean-model-skill-assessor/demo_local')

Or you can use a command:

!omsa proj_path --project_name demo_local
/home/docs/.cache/ocean-model-skill-assessor/demo_local

Here we know the names of the files, so we show them inline.
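Before displaying the files, it can help to confirm that the run actually produced them. The sketch below is one way to do that with pathlib. The helper missing_outputs is hypothetical and not part of OMSA; the file names and the output directory are the ones that appear in this demo, so adjust them for your own project.

```python
from pathlib import Path


def missing_outputs(out_dir, expected_names):
    """Return the expected file names that are not present in out_dir."""
    out_dir = Path(out_dir)
    return [name for name in expected_names if not (out_dir / name).is_file()]


# File names used in this demo.
expected = [
    "map.png",
    "local_gov_ornl_cdiac_coastalms_88w_30n_temp.png",
    "local_gov_ornl_cdiac_coastalms_88w_30n_temp.yaml",
]

# Output directory as reported in this demo; yours will differ.
out_dir = Path("/home/docs/.cache/ocean-model-skill-assessor/demo_local/out")
print(missing_outputs(out_dir, expected))
```

An empty list means every expected file is in place; any names that are printed point to a step that did not complete.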

First we see a map of the area around the Mississippi river delta, along with a red line outlining the approximate domain of the numerical model, and one black dot indicating the single data location, labeled with the station name.

Image(paths.OUT_DIR / "map.png")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File ~/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/IPython/core/display.py:1045, in Image._data_and_metadata(self, always_both)
   1044 try:
-> 1045     b64_data = b2a_base64(self.data, newline=False).decode("ascii")
   1046 except TypeError as e:

TypeError: a bytes-like object is required, not 'str'

The above exception was the direct cause of the following exception:

FileNotFoundError                         Traceback (most recent call last)
File ~/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/IPython/core/formatters.py:974, in MimeBundleFormatter.__call__(self, obj, include, exclude)
    971     method = get_real_method(obj, self.print_method)
    973     if method is not None:
--> 974         return method(include=include, exclude=exclude)
    975     return None
    976 else:

File ~/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/IPython/core/display.py:1035, in Image._repr_mimebundle_(self, include, exclude)
   1033 if self.embed:
   1034     mimetype = self._mimetype
-> 1035     data, metadata = self._data_and_metadata(always_both=True)
   1036     if metadata:
   1037         metadata = {mimetype: metadata}

File ~/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/IPython/core/display.py:1047, in Image._data_and_metadata(self, always_both)
   1045     b64_data = b2a_base64(self.data, newline=False).decode("ascii")
   1046 except TypeError as e:
-> 1047     raise FileNotFoundError(
   1048         "No such file or directory: '%s'" % (self.data)) from e
   1049 md = {}
   1050 if self.metadata:

FileNotFoundError: No such file or directory: '/home/docs/.cache/ocean-model-skill-assessor/demo_local/out/map.png'
<IPython.core.display.Image object>

Here we see a time series comparison for station “gov_ornl_cdiac_coastalms_88w_30n”. It shows the temperature values from the data in black and the comparable values from the model in red. The comparison time range is November 19, 2009 from 12:00 to 15:30. The lines are not similar because the data is actually missing during this time period. Statistical comparisons are also available in the title text.

Image(paths.OUT_DIR / "local_gov_ornl_cdiac_coastalms_88w_30n_temp.png")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File ~/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/IPython/core/display.py:1045, in Image._data_and_metadata(self, always_both)
   1044 try:
-> 1045     b64_data = b2a_base64(self.data, newline=False).decode("ascii")
   1046 except TypeError as e:

TypeError: a bytes-like object is required, not 'str'

The above exception was the direct cause of the following exception:

FileNotFoundError                         Traceback (most recent call last)
File ~/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/IPython/core/formatters.py:974, in MimeBundleFormatter.__call__(self, obj, include, exclude)
    971     method = get_real_method(obj, self.print_method)
    973     if method is not None:
--> 974         return method(include=include, exclude=exclude)
    975     return None
    976 else:

File ~/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/IPython/core/display.py:1035, in Image._repr_mimebundle_(self, include, exclude)
   1033 if self.embed:
   1034     mimetype = self._mimetype
-> 1035     data, metadata = self._data_and_metadata(always_both=True)
   1036     if metadata:
   1037         metadata = {mimetype: metadata}

File ~/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/IPython/core/display.py:1047, in Image._data_and_metadata(self, always_both)
   1045     b64_data = b2a_base64(self.data, newline=False).decode("ascii")
   1046 except TypeError as e:
-> 1047     raise FileNotFoundError(
   1048         "No such file or directory: '%s'" % (self.data)) from e
   1049 md = {}
   1050 if self.metadata:

FileNotFoundError: No such file or directory: '/home/docs/.cache/ocean-model-skill-assessor/demo_local/out/local_gov_ornl_cdiac_coastalms_88w_30n_temp.png'
<IPython.core.display.Image object>
import yaml
with open(paths.OUT_DIR / "local_gov_ornl_cdiac_coastalms_88w_30n_temp.yaml", "r") as stream:
    stats = yaml.safe_load(stream)
stats
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[11], line 2
      1 import yaml
----> 2 with open(paths.OUT_DIR / "local_gov_ornl_cdiac_coastalms_88w_30n_temp.yaml", "r") as stream:
      3     stats = yaml.safe_load(stream)
      4 stats

File ~/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/stable/lib/python3.9/site-packages/IPython/core/interactiveshell.py:310, in _modified_open(file, *args, **kwargs)
    303 if file in {0, 1, 2}:
    304     raise ValueError(
    305         f"IPython won't let you open fd={file} by default "
    306         "as it is likely to crash IPython. If you know what you are doing, "
    307         "you can use builtins' open."
    308     )
--> 310 return io_open(file, *args, **kwargs)

FileNotFoundError: [Errno 2] No such file or directory: '/home/docs/.cache/ocean-model-skill-assessor/demo_local/out/local_gov_ornl_cdiac_coastalms_88w_30n_temp.yaml'