ocean_model_skill_assessor.utils

Utility functions.

Functions

calculate_anomaly(dd_in, monthly_mean[, varname])

Given monthly mean that is indexed by month of year, subtract it from time series to get anomaly.

calculate_distance(lons, lats)

Calculate distance (km), especially for transects.

check_catalog(cat[, source_names, skip_strings])

Check a catalog for required keys.

check_dataframe(dfd, no_Z)

Check dataframe for T, Z, lon, lat; reset indices; parse dates.

check_dataset(ds[, is_model, no_Z])

Check xarray datasets (usually model output) for necessary cf-xarray dims/coords.

coords1Dto2D(dam)

Expand 1D coordinates to 2D.

find_bbox(ds[, paths, mask, dd, alpha, save])

Determine bounds and boundary of model.

fix_dataset(model_var, ds)

Fill in info necessary to pass check_dataset() if possible.

get_mask(dsm, varname[, wetdry])

Return mask that matches x/y coords of var.

kwargs_search_from_model(kwargs_search, paths)

Adds spatial and/or temporal range from model output to dict.

open_catalogs(catalogs[, paths, skip_check, ...])

Initialize catalog objects from inputs.

open_vocab_labels(vocab_labels[, paths])

Open dict of vocab_labels if needed.

open_vocabs(vocabs[, paths])

Open vocabularies, can input mix of forms.

read_model_file(fname_processed_model, no_Z, dsm)

Read processed model output from a file.

read_processed_data_file(...)

Read processed data from a file.

save_processed_files(dfd, ...)

Save processed data and model output into files.

set_up_logging(verbose, paths[, mode, testing])

Set up logging.

shift_longitudes(dam)

Shift longitudes from the 0 to 360 range to the -180 to 180 range if necessary.

ocean_model_skill_assessor.utils.calculate_anomaly(dd_in, monthly_mean, varname=None)

Given monthly mean that is indexed by month of year, subtract it from time series to get anomaly.

Should work with pd.Series, pd.DataFrame, and xr.DataArray. Assumes that the variable in monthly_mean is the same as in the input time series. DataArrays are handled by converting them to a DataFrame. Assumes the input is a time series.

Returns dd as a DataFrame if it came in as a Series, and as a Dataset if it came in as a DataArray. It is a pd.Series in the middle, so this probably won’t work well for datasets that are more complex than time series.
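The core operation, subtracting a month-of-year mean from each time step, can be illustrated in plain Python. This is a minimal sketch of the idea and not the library's implementation; the helper name subtract_monthly_mean and the sample values are hypothetical.

```python
from datetime import datetime


def subtract_monthly_mean(times, values, monthly_mean):
    """Subtract a month-of-year mean from each value of a time series.

    `monthly_mean` maps a month number (1-12) to the mean for that month.
    """
    return [value - monthly_mean[t.month] for t, value in zip(times, values)]


# Hypothetical sample data: two Januaries and one February.
times = [datetime(2020, 1, 15), datetime(2020, 2, 15), datetime(2021, 1, 15)]
values = [10.5, 11.0, 9.5]
monthly_mean = {1: 10.0, 2: 12.0}

anomaly = subtract_monthly_mean(times, values, monthly_mean)
```

Both January values are compared against the same January mean, which is what indexing the mean by month of year implies.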

ocean_model_skill_assessor.utils.calculate_distance(lons, lats)

Calculate distance (km), especially for transects.
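The method used to compute the distance is not stated here. One common approach for transects is the cumulative great-circle (haversine) distance along the track, sketched below in plain Python. The function names and the Earth radius value are assumptions of this sketch, not part of the library.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cumulative_distance_km(lons, lats):
    """Distance along a transect, starting at 0 km for the first point."""
    distances = [0.0]
    for i in range(1, len(lons)):
        step = haversine_km(lons[i - 1], lats[i - 1], lons[i], lats[i])
        distances.append(distances[-1] + step)
    return distances


# Three points along the equator, one degree of longitude apart.
dist = cumulative_distance_km([0.0, 1.0, 2.0], [0.0, 0.0, 0.0])
```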

ocean_model_skill_assessor.utils.check_catalog(cat, source_names=None, skip_strings=None)

Check a catalog for required keys.

Parameters:
  • cat (Catalog) – Catalog object

  • source_names (list, optional) – If provided, use these source_names instead of list(cat) for checking.

  • skip_strings (list of strings, optional) – If provided, source_names in catalog will only be checked for goodness if they do not contain one of skip_strings. For example, if skip_strings=["_base"] then any source in the catalog whose name contains that string will be skipped.
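The skip_strings rule described above amounts to a substring filter on the source names. A minimal sketch, assuming a plain list of names; the helper names_to_check and the sample names are hypothetical and not part of the library.

```python
def names_to_check(source_names, skip_strings=None):
    """Return the names that contain none of the strings in skip_strings."""
    skip_strings = skip_strings or []
    return [
        name
        for name in source_names
        if not any(skip in name for skip in skip_strings)
    ]


# Hypothetical source names; "station_a_base" contains "_base" and is skipped.
checked = names_to_check(
    ["station_a", "station_a_base", "glider_1"], skip_strings=["_base"]
)
```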

ocean_model_skill_assessor.utils.check_dataframe(dfd, no_Z)

Check dataframe for T, Z, lon, lat; reset indices; parse dates.

ocean_model_skill_assessor.utils.check_dataset(ds, is_model=True, no_Z=False)

Check xarray datasets (usually model output) for necessary cf-xarray dims/coords.

If Dataset is model output (is_model=True), must have T, Z, vertical, latitude, longitude, and “positive” attribute must be associated with Z or vertical. But, if no_Z=True, neither Z, vertical, nor positive attribute need to be present.

If Dataset is not model output (is_model=False), must have T, Z, latitude, longitude. But, if no_Z=True, Z does not need to be present.
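The two rules above can be summarized as a small lookup of what must be present for each combination of is_model and no_Z. This sketch only encodes the requirements as stated in the text; it does not inspect a real Dataset, and the helper name required_items is hypothetical.

```python
def required_items(is_model=True, no_Z=False):
    """Return the names that must be present, per the rules stated above.

    "positive" is an attribute on Z or vertical rather than a coordinate;
    it is listed here alongside the coordinates for simplicity.
    """
    required = {"T", "longitude", "latitude"}
    if not no_Z:
        required.add("Z")
        if is_model:
            required.update({"vertical", "positive"})
    return required


model_full = required_items(is_model=True, no_Z=False)
data_no_z = required_items(is_model=False, no_Z=True)
```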

ocean_model_skill_assessor.utils.coords1Dto2D(dam)

Expand 1D coordinates to 2D.

Parameters:

dam (DataArray) – Model output variable to work on.

Returns:

Model output but with 2D coordinates in place of 1D coordinates, if applicable. Otherwise same as input.

Return type:

DataArray
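Expanding 1D coordinates to 2D means repeating each coordinate along the other axis so that every grid cell has its own lon/lat pair. A minimal sketch using nested lists, assuming 1D longitude and latitude vectors; the helper name expand_1d_to_2d is hypothetical, and the library operates on DataArrays rather than lists.

```python
def expand_1d_to_2d(lons, lats):
    """Return 2D lon and lat grids of shape (len(lats), len(lons))."""
    lon2d = [list(lons) for _ in lats]          # each row repeats all longitudes
    lat2d = [[lat] * len(lons) for lat in lats]  # each row holds one latitude
    return lon2d, lat2d


lon2d, lat2d = expand_1d_to_2d([-150.0, -149.0, -148.0], [58.0, 59.0])
```

The resulting layout matches what numpy.meshgrid(lons, lats) returns with its default indexing.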

ocean_model_skill_assessor.utils.find_bbox(ds, paths=None, mask=None, dd=1, alpha=5, save=False)

Determine bounds and boundary of model.

This does not know how to handle a rectilinear 1D lon/lat model with a mask.

Parameters:
  • ds (DataArray) – xarray Dataset containing model output.

  • paths (Paths) – Paths object for finding paths to use.

  • mask (DataArray, optional) – Mask with 1’s for active locations and 0’s for masked.

  • dd (int, optional) – Number to decimate model output lon/lat, as a stride.

  • alpha (int, optional) – Number for alphashape to determine what counts as the convex hull. A larger number gives a more detailed boundary; 1 is a good starting point.

  • save (bool, optional) – Input True to save.

Returns:

Contains the names of the longitude and latitude variables for ds, the geographic bounding box of model output ([min_lon, min_lat, max_lon, max_lat]), and low-resolution and high-resolution WKT representations of the model boundary.

Return type:

List

Notes

This was originally from the package model_catalogs.
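The bounding-box ordering and the stride behaviour of dd can be illustrated with plain lists. This is a sketch under stated assumptions: the helper name bounding_box and the coordinates are hypothetical, and the boundary computation with alphashape is not shown.

```python
def bounding_box(lons, lats):
    """Return [min_lon, min_lat, max_lon, max_lat] for flat coordinate lists."""
    return [min(lons), min(lats), max(lons), max(lats)]


# Hypothetical model coordinates.
lons = [-152.0, -150.5, -149.0, -147.5, -146.0]
lats = [57.0, 58.0, 59.0, 60.0, 61.0]

bbox = bounding_box(lons, lats)

# Decimating with a stride of dd keeps every dd-th point.
dd = 2
lons_coarse = lons[::dd]
```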

ocean_model_skill_assessor.utils.fix_dataset(model_var, ds)

Fill in info necessary to pass check_dataset() if possible.

Right now it is only for converting horizontal indices to lon/lat but conceivably could do more in the future. Looks for lon/lat being 2D coords.

Parameters:
  • model_var (Union[xr.DataArray,xr.Dataset]) – xarray object that needs some more info filled in

  • ds (Union[xr.DataArray,xr.Dataset]) – xarray object that has info that can be used to fill in model_var

Returns:

model_var with more information included, hopefully

Return type:

Union[xr.DataArray,xr.Dataset]

ocean_model_skill_assessor.utils.get_mask(dsm, varname, wetdry=False)

Return mask that matches x/y coords of var.

If no mask can be identified with .filter_by_attrs(flag_meanings="land water"), a mask of non-NaNs is made instead from one horizontal grid cross-section of varname.

Parameters:
  • dsm (Dataset) – Model output

  • varname (str) – Name of variable in dsm.

  • wetdry (bool) – If True, selected mask must include “wetdry” in name and will use first time step.

Returns:

mask associated with varname in dsm

Return type:

DataArray
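The fallback described above, building a mask from the non-NaN cells of one horizontal slice, can be sketched with nested lists. The helper name mask_from_non_nans is hypothetical, and the 1 for active / 0 for masked convention is borrowed from the mask parameter of find_bbox; the library works on DataArrays.

```python
import math


def mask_from_non_nans(field_2d):
    """Return 1 where a value is present and 0 where it is NaN."""
    return [[0 if math.isnan(v) else 1 for v in row] for row in field_2d]


nan = float("nan")
# Hypothetical 2x2 horizontal slice with two missing (land) cells.
mask = mask_from_non_nans([[nan, 4.2], [3.1, nan]])
```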

ocean_model_skill_assessor.utils.kwargs_search_from_model(kwargs_search, paths)

Adds spatial and/or temporal range from model output to dict.

Examines model output and uses the bounding box of the model as the search spatial range if needed, and the time range of the model as the search time range if needed. These are added to kwargs_search and the dict is returned.

Parameters:
  • kwargs_search (dict) – Keyword arguments to input to search on the server before making the catalog.

  • paths (Paths) – Paths object for finding paths to use.

Returns:

kwargs_search but with modifications if relevant.

Return type:

dict

Raises:

KeyError – If all of max_lon, min_lon, max_lat, min_lat and min_time, max_time are already specified along with model_name.
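The fill-in behaviour can be sketched as a dict update that only supplies missing keys. This is an illustration under assumptions: the helper fill_search_range is hypothetical, the model bounding box and time range are passed in directly rather than read from model output, and the KeyError case and model_name handling are omitted.

```python
SPATIAL_KEYS = ("min_lon", "min_lat", "max_lon", "max_lat")
TIME_KEYS = ("min_time", "max_time")


def fill_search_range(kwargs_search, model_bbox, model_time_range):
    """Return a copy of kwargs_search with missing ranges taken from the model.

    Keys that are already present are left unchanged (an assumption of this
    sketch; the library may treat partially specified ranges differently).
    """
    filled = dict(kwargs_search)
    for key, value in zip(SPATIAL_KEYS, model_bbox):
        filled.setdefault(key, value)
    for key, value in zip(TIME_KEYS, model_time_range):
        filled.setdefault(key, value)
    return filled


# The time range is already given, so only the spatial range is filled in.
result = fill_search_range(
    {"min_time": "2020-01-01", "max_time": "2020-02-01"},
    model_bbox=[-152.0, 57.0, -146.0, 61.0],
    model_time_range=("2019-01-01", "2021-01-01"),
)
```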

ocean_model_skill_assessor.utils.open_catalogs(catalogs, paths=None, skip_check=False, skip_strings=None)

Initialize catalog objects from inputs.

Parameters:
  • catalogs (Union[str, Catalog, Sequence]) – Catalog name(s) or list of names, or catalog object or list of catalog objects.

  • paths (Paths, optional) – Paths object for finding paths to use. Required if any catalog is a string referencing paths.

  • skip_check (bool) – If True, do not check catalogs. Use this for testing as needed. Default is False.

  • skip_strings (list of strings, optional) – If provided, source_names in catalog will only be checked for goodness if they do not contain one of skip_strings. For example, if skip_strings=["_base"] then any source in the catalog whose name contains that string will be skipped.

Returns:

Catalogs, ready to use.

Return type:

list[Catalog]

ocean_model_skill_assessor.utils.open_vocab_labels(vocab_labels, paths=None)

Open dict of vocab_labels if needed.

Parameters:
  • vocab_labels (Union[str, Vocab, Sequence, Path], optional) – Criteria to use to map from variable to attributes describing the variable. This is to be used with a key representing what variable to search for. This input is for the name of one or more existing vocabularies which are stored in a user application cache.

  • paths (Paths, optional) – Paths object for finding paths to use.

Returns:

dict of vocab_labels for plotting

Return type:

dict

ocean_model_skill_assessor.utils.open_vocabs(vocabs, paths=None)

Open vocabularies, can input mix of forms.

Parameters:
  • vocabs (Union[str, Vocab, Sequence, Path]) – Criteria to use to map from variable to attributes describing the variable. This is to be used with a key representing what variable to search for. This input is for the name of one or more existing vocabularies which are stored in a user application cache.

  • paths (Paths, optional) – Paths object for finding paths to use. Required if any input vocab is a str referencing paths.

Returns:

Single Vocab object with vocab stored in vocab.vocab

Return type:

Vocab

ocean_model_skill_assessor.utils.read_model_file(fname_processed_model, no_Z, dsm)

Read processed model output from a file.

Parameters:
  • fname_processed_model (Path) – Model file path

  • no_Z (bool) – If True, Z does not need to be present.

  • dsm (Dataset) – Model output

Returns:

Processed model output

Return type:

Dataset

ocean_model_skill_assessor.utils.read_processed_data_file(fname_processed_data, no_Z)

Read processed data from a file.

Parameters:
  • fname_processed_data (Path) – Data file path

  • no_Z (bool) – If True, Z does not need to be present.

Returns:

Processed data

Return type:

DataFrame or Dataset

ocean_model_skill_assessor.utils.save_processed_files(dfd, fname_processed_data, model_var, fname_processed_model)

Save processed data and model output into files.

Parameters:
  • dfd (Union[xr.Dataset, pd.DataFrame]) – Processed data

  • fname_processed_data (Path) – Data file path

  • model_var (xr.Dataset) – Processed model output

  • fname_processed_model (Path) – Model file path

ocean_model_skill_assessor.utils.set_up_logging(verbose, paths, mode='w', testing=False)

Set up logging.

ocean_model_skill_assessor.utils.shift_longitudes(dam)

Shift longitudes from the 0 to 360 range to the -180 to 180 range if necessary.

Parameters:

dam (Union[DataArray,Dataset]) – Object with model output to check

Returns:

Model output with shifted longitudes, if shifting was necessary.

Return type:

Union[DataArray,Dataset]
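The longitude shift itself is a standard modular-arithmetic conversion. A minimal sketch on plain floats, assuming the usual formula; the helper name shift_longitude is hypothetical, and the library applies the shift to xarray objects rather than lists.

```python
def shift_longitude(lon):
    """Map a longitude in degrees onto the range [-180, 180)."""
    return ((lon + 180.0) % 360.0) - 180.0


shifted = [shift_longitude(lon) for lon in [0.0, 90.0, 180.0, 270.0, 359.0]]
```

With this formula, values already within -180 to 180 are unchanged, except that exactly 180 maps to -180.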