import ocean_model_skill_assessor as omsa
import cf_pandas as cfp
import xroms

How to use ocean-model-skill-assessor

… as a Python package. Other notebooks describe its command-line interface (CLI).

This notebook parallels the CLI demo but is more brief.

Validating model output against data for one variable takes three steps:

  1. Make a catalog for your model output.

  2. Make a catalog for your data.

  3. Run the comparison.

These steps save files, along with a log, into a cache in the user application directory. The project directory can be found on the command line with omsa proj_path --project_name PROJECT_NAME.

project_name = "demo_local_package"
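Once the steps have been run, the files saved for a project can be listed with the standard library. This is a minimal sketch, not part of the package: list_project_files is a hypothetical helper, and the directory location is an assumption taken from the log output later on this page (a Linux docs build), so it may differ on other systems.

```python
from pathlib import Path


def list_project_files(project_dir):
    """Return all files under project_dir as sorted paths relative to it."""
    root = Path(project_dir)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# Assumed location, copied from the log output on this page; prefer the path
# reported by `omsa proj_path --project_name PROJECT_NAME` on your machine.
project_dir = (
    Path.home() / ".cache" / "ocean-model-skill-assessor" / "demo_local_package"
)

if project_dir.is_dir():
    for name in list_project_files(project_dir):
        print(name)
```

The guard on is_dir() keeps the snippet harmless when the project has not been created yet.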

Make model catalog

We’re using example ROMS model output that is available through xroms for our model.

url = xroms.datasets.CLOVER.fetch("ROMS_example_full_grid.nc")
kwargs = {
    "filenames": [url],
    "skip_entry_metadata": True,
}
cat_model = omsa.main.make_catalog(
                        catalog_type="local",
                        project_name=project_name,
                        catalog_name="model",
                        kwargs=kwargs,
                        return_cat=True,
)
Downloading file 'ROMS_example_full_grid.nc' from 'https://github.com/xoceanmodel/xroms/raw/main/xroms/data/ROMS_example_full_grid.nc' to '/home/docs/.cache/xroms'.
cat_model
model:
  args:
    description: Catalog of type local.
    name: model
  description: Catalog of type local.
  driver: intake.catalog.base.Catalog
  metadata: {}

Make data catalog

Set up a catalog of the datasets with which you want to compare your model output. In this example, we use only known data file locations to create our catalog.

Note that we need to include the “featuretype” and “maptype” in the metadata for the data sources. More information can be found on these items in the docs.

filenames = ["https://erddap.sensors.axds.co/erddap/tabledap/gov_ornl_cdiac_coastalms_88w_30n.csvp?time%2Clatitude%2Clongitude%2Cz%2Csea_water_temperature&time%3E=2009-11-19T012%3A00%3A00Z&time%3C=2009-11-19T16%3A00%3A00Z",]

cat_data = omsa.make_catalog(project_name="demo_local_package",
                             catalog_type="local",
                             catalog_name="local",
                             kwargs=dict(filenames=filenames),
                             metadata={"featuretype": "timeSeries", "maptype": "point"})
[2023-11-27 22:03:59,878] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:216}
WARNING - Dataset gov_ornl_cdiac_coastalms_88w_30n had a timezone UTC which is being removed. Make sure the timezone matches the model output.
cat_data
local:
  args:
    description: Catalog of type local.
    name: local
  description: Catalog of type local.
  driver: intake.catalog.base.Catalog
  metadata: {}
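The data file location used above is a percent-encoded ERDDAP tabledap URL, which makes the request hard to read at a glance. The sketch below decodes it with the standard library only, to show which dataset, variables, and time constraints are requested. The names dataset_id, variables, and constraints are illustrative and are not part of ocean-model-skill-assessor.

```python
from urllib.parse import unquote, urlsplit

# URL copied verbatim from the filenames list above.
url = (
    "https://erddap.sensors.axds.co/erddap/tabledap/"
    "gov_ornl_cdiac_coastalms_88w_30n.csvp"
    "?time%2Clatitude%2Clongitude%2Cz%2Csea_water_temperature"
    "&time%3E=2009-11-19T012%3A00%3A00Z"
    "&time%3C=2009-11-19T16%3A00%3A00Z"
)

parts = urlsplit(url)

# The last path segment is "<dataset id>.<file type>".
dataset_id = parts.path.rsplit("/", 1)[-1].split(".")[0]

# The first query element lists the requested variables; the remaining
# elements are constraints on those variables.
query_items = [unquote(item) for item in parts.query.split("&")]
variables = query_items[0].split(",")
constraints = query_items[1:]

print(dataset_id)
print(variables)
print(constraints)
```

The decoded dataset id is the same string that appears as the source name in the log output.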

Run comparison

Now that the model output and dataset catalogs are prepared, we can run the comparison of the two.

At this point we need to select a single variable to compare between the model and the datasets, and this requires a little extra input. Because we don’t know the specifics of the format of any given input data file, variable names are interpreted flexibly by matching them against a set of regular expressions. In the present case, we compare the water temperature between the model and the datasets (the model output and the datasets selected for our catalogs must contain the variable we want to compare).

Several sets of regular expressions, called “vocabularies”, are available with the package for this purpose. Here we use one called “general”, which should match many commonly used variable names. “general” is selected with the vocabs argument, and the particular key from the general vocabulary that we are comparing is selected with key_variable.

The contents of the “general” vocabulary are shown below.

paths = omsa.paths.Paths()
cfp.Vocab(paths.VOCAB_PATH("general"))
{'temp': {'name': '(?i)^(?!.*(air|qc|status|atmospheric|bottom|dew)).*(temp|sst).*'}, 'salt': {'name': '(?i)^(?!.*(soil|qc|status|bottom)).*(sal|sss).*'}, 'ssh': {'name': '(?i)^(?!.*(qc|status)).*(sea_surface_height|surface_elevation|zeta).*'}, 'u': {'name': 'u$|(?i)(?=.*east)(?=.*vel)'}, 'v': {'name': 'v$|(?i)(?=.*north)(?=.*vel)'}, 'w': {'name': 'w$|(?i)(?=.*up)(?=.*vel)'}, 'water_dir': {'name': '(?i)^(?!.*(qc|status|air|wind))(?=.*dir)(?=.*water)'}, 'water_speed': {'name': '(?i)^(?!.*(qc|status|air|wind))(?=.*speed)(?=.*water)'}, 'wind_dir': {'name': '(?i)^(?!.*(qc|status|water))(?=.*dir)(?=.*wind)'}, 'wind_speed': {'name': '(?i)^(?!.*(qc|status|water))(?=.*speed)(?=.*wind)'}, 'sea_ice_u': {'name': '(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*u)|(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*x)(?=.*vel)|(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*east)(?=.*vel)'}, 'sea_ice_v': {'name': '(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*v)|(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*y)(?=.*vel)|(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*north)(?=.*vel)'}, 'sea_ice_area_fraction': {'name': '(?i)^(?!.*(qc|status))(?=.*sea)(?=.*ice)(?=.*area)(?=.*fraction)'}}
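To see how such a vocabulary entry selects variables, the sketch below applies the “temp” and “salt” patterns from the output above to a few candidate variable names using Python's built-in re module. This is only an illustration: matching_names and the candidate list are hypothetical, and the matching done inside cf-pandas may differ in detail.

```python
import re

# Patterns copied from the "general" vocabulary printed above.
GENERAL_VOCAB = {
    "temp": r"(?i)^(?!.*(air|qc|status|atmospheric|bottom|dew)).*(temp|sst).*",
    "salt": r"(?i)^(?!.*(soil|qc|status|bottom)).*(sal|sss).*",
}


def matching_names(key, names, vocab=GENERAL_VOCAB):
    """Return the names that match the regular expression stored under key."""
    pattern = re.compile(vocab[key])
    return [name for name in names if pattern.match(name)]


# Hypothetical variable names, as might appear in model output or a data file.
candidates = [
    "temp",
    "sea_water_temperature",
    "air_temperature",
    "sea_water_temperature_qc_agg",
    "salt",
    "zeta",
]

print(matching_names("temp", candidates))  # ['temp', 'sea_water_temperature']
print(matching_names("salt", candidates))  # ['salt']
```

The negative lookahead at the start of each pattern is what excludes names such as air_temperature or quality-control columns, while the case-insensitive flag lets the same key match differently capitalized names.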

Now we run the model-data comparison. Check the API docs for details about the keyword inputs. Note that the data contain filler values for this time period, which is why the comparison is so far off.

omsa.run(project_name="demo_local_package", catalogs=cat_data, model_name=cat_model,
         vocabs="general", key_variable="temp", interpolate_horizontal=False,
         check_in_boundary=False, plot_map=True, dd=5, alpha=20)
[2023-11-27 22:03:59,917] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1803}
INFO - Input parameters: {'catalogs': <Intake catalog: local>, 'project_name': 'demo_local_package', 'key_variable': 'temp', 'model_name': <Intake catalog: model>, 'vocabs': 'general', 'vocab_labels': None, 'ndatasets': None, 'kwargs_map': None, 'verbose': True, 'mode': 'w', 'testing': False, 'alpha': 20, 'dd': 5, 'preprocess': False, 'need_xgcm_grid': False, 'xcmocean_options': None, 'kwargs_xroms': None, 'locstream': True, 'interpolate_horizontal': False, 'horizontal_interp_code': 'delaunay', 'save_horizontal_interp_weights': True, 'want_vertical_interp': False, 'extrap': False, 'model_source_name': None, 'catalog_source_names': None, 'user_min_time': None, 'user_max_time': None, 'check_in_boundary': False, 'tidal_filtering': None, 'ts_mods': None, 'model_only': False, 'plot_map': True, 'no_Z': False, 'skip_mask': False, 'wetdry': False, 'plot_count_title': True, 'cache_dir': None, 'return_fig': False, 'override_model': False, 'override_processed': False, 'override_stats': False, 'override_plot': False, 'plot_description': None, 'kwargs_plot': None, 'skip_key_variable_check': False, 'kwargs': {}, 'paths': <ocean_model_skill_assessor.paths.Paths object at 0x7ff9cd41e9a0>, 'logger': <Logger ocean_model_skill_assessor.utils (INFO)>}
[2023-11-27 22:03:59,925] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1838}
INFO - Note that there are 1 datasets to use. This might take awhile.
[2023-11-27 22:03:59,925] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1855}
INFO - Catalog <Intake catalog: local>.
[2023-11-27 22:03:59,926] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1870}
INFO - 
source name: gov_ornl_cdiac_coastalms_88w_30n (1 of 1 for catalog <Intake catalog: local>.
[2023-11-27 22:04:00,080] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1069}
INFO - 
                        User time range: NaT to NaT.
                        Model time range: 2009-11-19 12:00:00 to 2009-11-19 16:00:00.
                        Data time range: 2009-11-19 12:17:00 to 2009-11-19 15:17:00.
                        Data lon range: -88.6 to -88.6.
                        Data lat range: 30.0 to 30.0.
[2023-11-27 22:04:00,081] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1946}
INFO - running gov_ornl_cdiac_coastalms_88w_30n for key_variable(s) temp from key_variable_list ['temp']
[2023-11-27 22:04:00,600] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:878}
INFO - Processed data file name is /home/docs/.cache/ocean-model-skill-assessor/demo_local_package/processed/local_gov_ornl_cdiac_coastalms_88w_30n_temp_data.csv.
[2023-11-27 22:04:00,600] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:879}
INFO - Processed model file name is /home/docs/.cache/ocean-model-skill-assessor/demo_local_package/processed/local_gov_ornl_cdiac_coastalms_88w_30n_temp_model.nc.
[2023-11-27 22:04:00,601] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:880}
INFO - model file name is /home/docs/.cache/ocean-model-skill-assessor/demo_local_package/model_output/local_gov_ornl_cdiac_coastalms_88w_30n_temp.nc.
[2023-11-27 22:04:00,602] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:2004}
INFO - Figure name is /home/docs/.cache/ocean-model-skill-assessor/demo_local_package/out/local_gov_ornl_cdiac_coastalms_88w_30n_temp.png.
[2023-11-27 22:04:00,603] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:2025}
INFO - No previously processed model output and data available for gov_ornl_cdiac_coastalms_88w_30n, so setting up now.
[2023-11-27 22:04:00,608] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1363}
INFO - Finding and saving mask to cache to /home/docs/.cache/ocean-model-skill-assessor/demo_local_package/mask_temp.nc.
[2023-11-27 22:04:00,608] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/utils.py:570}
INFO - Retrieving mask
[2023-11-27 22:04:04,437] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1158}
INFO - Calculating numerical domain boundary.
[2023-11-27 22:04:04,445] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:969}
WARNING - Dataset gov_ornl_cdiac_coastalms_88w_30n had a timezone UTC which is being removed. Make sure the timezone matches the model output.
[2023-11-27 22:04:04,528] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:793}
WARNING - the 'vertical' key cannot be identified in dam by cf-xarray. Maybe you need to include the xgcm grid and vertical metrics for xgcm grid, but maybe your variable does not have a vertical axis.
[2023-11-27 22:04:04,537] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:667}
INFO - Will not perform vertical interpolation and will find nearest depth to 0.0.
[2023-11-27 22:04:04,539] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1452}
INFO - Selecting model output at locations to match dataset gov_ornl_cdiac_coastalms_88w_30n.
[2023-11-27 22:04:04,596] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1484}
INFO - 
    Model coordinates found are Coordinates:
    xi_rho      int64 299
    eta_rho     int64 92
    lon_rho     float64 -88.6
    lat_rho     float64 29.97
    s_rho       float64 -0.01667
    npts        int64 0
  * ocean_time  (ocean_time) datetime64[ns] 2009-11-19T12:17:00 2009-11-19T15....
    
    Output information from finding nearest neighbors to requested points are {'distances': array([0.12069942]), 'eta_rho': array([92]), 'xi_rho': array([299])}.
[2023-11-27 22:04:04,600] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1524}
INFO - Trying to drop vertical coordinates time series
[2023-11-27 22:04:04,602] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1537}
INFO - Loading model output...
[2023-11-27 22:04:04,805] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:1644}
INFO - Saving model output to file...
[2023-11-27 22:04:04,969] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:2337}
INFO - model file name is /home/docs/.cache/ocean-model-skill-assessor/demo_local_package/model_output/local_gov_ornl_cdiac_coastalms_88w_30n_temp.nc.
[2023-11-27 22:04:04,970] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:2339}
INFO - Reading model output from file.
[2023-11-27 22:04:05,097] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:2362}
INFO - Calculating stats for temp.
[2023-11-27 22:04:05,112] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/latest/lib/python3.9/warnings.py:109}
WARNING - /home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/latest/lib/python3.9/site-packages/numpy/lib/function_base.py:2853: RuntimeWarning: invalid value encountered in divide
  c /= stddev[:, None]
[2023-11-27 22:04:05,112] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/latest/lib/python3.9/warnings.py:109}
WARNING - /home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/latest/lib/python3.9/site-packages/numpy/lib/function_base.py:2854: RuntimeWarning: invalid value encountered in divide
  c /= stddev[None, :]
[2023-11-27 22:04:05,118] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/latest/lib/python3.9/warnings.py:109}
WARNING - /home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/stats.py:100: RuntimeWarning: divide by zero encountered in double_scalars
  return float(1 - ((obs - model) ** 2).sum() / ((obs - obs_model) ** 2).sum())
[2023-11-27 22:04:05,124] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:2380}
INFO - Saved stats file.
[2023-11-27 22:04:06,217] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:2437}
INFO - Made plot for gov_ornl_cdiac_coastalms_88w_30n
.
[2023-11-27 22:04:06,372] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/latest/lib/python3.9/warnings.py:109}
WARNING - /home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/latest/lib/python3.9/site-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_coastline.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)
[2023-11-27 22:04:08,003] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/latest/lib/python3.9/warnings.py:109}
WARNING - /home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/conda/latest/lib/python3.9/site-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_land.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)
[2023-11-27 22:04:30,682] {/home/docs/checkouts/readthedocs.org/user_builds/ocean-model-skill-assessor/checkouts/latest/ocean_model_skill_assessor/main.py:2451}
INFO - Finished analysis. Find plots, stats summaries, and log in /home/docs/.cache/ocean-model-skill-assessor/demo_local_package.
[Figures: time series comparison of model and data sea water temperature; map of the model domain and data location]

The plots show the time series comparison of sea water temperature between the model output and the data at one location. Also shown is a map of the Mississippi River Delta region, where the model is located, with an approximation of the numerical domain and the data location. Note that the comparison is poor because the data contain only filler values, not real measurements, for this time period.
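The divide-by-zero and invalid-value warnings in the log above are consistent with observations that do not vary over the comparison period, which is one plausible effect of filler values. The sketch below illustrates this with a skill score of the same form as the expression shown in the log warning. It is an assumption-laden illustration rather than the package's implementation: skill_score is a hypothetical function, the reference prediction is assumed to be the mean of the observations, and the numbers are made up.

```python
import math


def skill_score(obs, model):
    """Return 1 - SSE(model) / SSE(reference).

    The reference prediction is assumed here to be the mean of the
    observations; this is an assumption made for illustration.
    """
    obs_mean = sum(obs) / len(obs)
    sse_model = sum((o - m) ** 2 for o, m in zip(obs, model))
    sse_reference = sum((o - obs_mean) ** 2 for o in obs)
    if sse_reference == 0:
        # Constant observations: the score is undefined.
        return math.nan
    return 1 - sse_model / sse_reference


# Made-up values: varying observations versus a perfect and a flat model.
print(skill_score([20.0, 21.0, 22.0], [20.0, 21.0, 22.0]))  # 1.0
print(skill_score([20.0, 21.0, 22.0], [21.0, 21.0, 21.0]))  # 0.0

# Made-up constant "filler" observations: the denominator is zero.
print(skill_score([5.0, 5.0, 5.0], [20.0, 21.0, 22.0]))  # nan
```

When the observations are constant, their spread is zero, so any statistic that divides by it (such as a correlation coefficient or this skill score) is undefined. This sketch returns NaN in that case instead of dividing by zero.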