ocean_model_skill_assessor.main

Main run functions.

Functions

make_catalog(catalog_type, project_name[, ...])

Make a catalog given input selections.

make_local_catalog(filenames[, filetype, ...])

Make an intake catalog from specified data files, including model output locations.

run(catalogs, project_name, key_variable, ...)

Run the model-data comparison.

ocean_model_skill_assessor.main._check_prep_narrow_data(dd, key_variable_data, source_name, maps, vocab, user_min_time, user_max_time, data_min_time, data_max_time, logger=None)

Check, prep, and narrow the data to the time range.

Parameters:
  • dd (Union[pd.DataFrame, xr.Dataset]) – Data container.

  • key_variable_data (str) – Name of variable to access from dataset.

  • source_name (str) – Name of dataset we are accessing from the catalog.

  • maps (list) – Each entry is a list of information about a dataset; the last entry is for the present source_name or dataset. Each entry contains [min_lon, max_lon, min_lat, max_lat, source_name] and possibly an additional element containing “maptype”.

  • vocab (Vocab) – Way to find the criteria to use to map from variable to attributes describing the variable. This is to be used with a key representing what variable to search for.

  • user_min_time (pd.Timestamp) – If this is input, it will be used as the min time for the model. At this point in the code, it will be a pandas Timestamp though could be “NaT” (a null time value).

  • user_max_time (pd.Timestamp) – If this is input, it will be used as the max time for the model. At this point in the code, it will be a pandas Timestamp though could be “NaT” (a null time value).

  • data_min_time (pd.Timestamp) – The min time in the dataset catalog metadata, or if there is a constraint in the metadata such as an ERDDAP catalog allows, and it is more constrained than data_min_time, then the constraint time.

  • data_max_time (pd.Timestamp) – The max time in the dataset catalog metadata, or if there is a constraint in the metadata such as an ERDDAP catalog allows, and it is more constrained than data_max_time, then the constraint time.

  • logger (logger, optional) – Logger for messages.

Returns:

  • dd: data container that has been checked and processed. Will be None if a problem has been detected.

  • maps: list of data information. If there was a problem with this dataset, the final entry in maps representing the dataset will have been deleted.

Return type:

tuple
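
Examples

A minimal sketch of the kind of time narrowing this function applies, using a hypothetical DataFrame and time range:

>>> import pandas as pd
>>> dd = pd.DataFrame(
...     {"temp": [10.0, 11.0, 12.0]},
...     index=pd.date_range("2022-01-01", periods=3, freq="D"),
... )
>>> dd[(dd.index >= "2022-01-01") & (dd.index <= "2022-01-02")]  # keep rows in range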

ocean_model_skill_assessor.main._check_time_ranges(source_name, data_min_time, data_max_time, model_min_time, model_max_time, user_min_time, user_max_time, maps, logger=None)

Compare time ranges to determine whether dataset source_name should be skipped.

Parameters:
  • source_name (str) – Name of dataset we are accessing from the catalog.

  • data_min_time (pd.Timestamp) – The min time in the dataset catalog metadata, or if there is a constraint in the metadata such as an ERDDAP catalog allows, and it is more constrained than data_min_time, then the constraint time.

  • data_max_time (pd.Timestamp) – The max time in the dataset catalog metadata, or if there is a constraint in the metadata such as an ERDDAP catalog allows, and it is more constrained than data_max_time, then the constraint time.

  • user_min_time (pd.Timestamp) – If this is input, it will be used as the min time for the model. At this point in the code, it will be a pandas Timestamp though could be “NaT” (a null time value).

  • user_max_time (pd.Timestamp) – If this is input, it will be used as the max time for the model. At this point in the code, it will be a pandas Timestamp though could be “NaT” (a null time value).

  • model_min_time (pd.Timestamp) – Min model time step

  • model_max_time (pd.Timestamp) – Max model time step

  • maps (list) – Each entry is a list of information about a dataset; the last entry is for the present source_name or dataset. Each entry contains [min_lon, max_lon, min_lat, max_lat, source_name] and possibly an additional element containing “maptype”.

  • logger (logger, optional) – Logger for messages.

Returns:

  • skip_dataset: bool that is True if this dataset should be skipped

  • maps: list of dataset information with the final entry (representing the present dataset) removed if skip_dataset is True.

Return type:

tuple
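
Examples

One plausible form of the overlap test, with hypothetical times; the actual function also accounts for user times and null (“NaT”) values:

>>> import pandas as pd
>>> data_min_time, data_max_time = pd.Timestamp("2022-06-01"), pd.Timestamp("2022-06-30")
>>> model_min_time, model_max_time = pd.Timestamp("2022-01-01"), pd.Timestamp("2022-05-31")
>>> skip_dataset = (data_min_time > model_max_time) or (data_max_time < model_min_time)
>>> skip_dataset
True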

ocean_model_skill_assessor.main._choose_depths(dd, model_depth_attr_positive, no_Z, want_vertical_interp, logger=None)

Determine depths to interpolate to, if any.

This assumes the data container does not have indices, or at least no depth indices.

Parameters:
  • dd (DataFrame or Dataset) – Data container

  • model_depth_attr_positive (str) – Result of model.cf[“Z”].attrs[“positive”] from the model: “up” or “down”.

  • no_Z (bool) – If True, set Z=None so no vertical interpolation or selection occurs. Do this if your variable has no concept of depth, like the sea surface height.

  • want_vertical_interp (bool) – This is False unless the user wants to specify that vertical interpolation should happen. This is used in only certain cases, but in those cases it is important so that it is known to interpolate instead of trying to figure out a vertical level index (which is not currently possible).

  • logger (logger, optional) – Logger for messages.

Returns:

  • dd – Possibly modified Dataset with sign of depths to match model

  • Z – Depths to interpolate to with sign that matches the model depths.

  • vertical_interp – Flag, True if we should interpolate vertically, False if not.
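
Examples

A sketch of the sign matching between data depths and the model convention, with hypothetical values:

>>> model_depth_attr_positive = "up"  # from model.cf["Z"].attrs["positive"]
>>> data_depths = [2.0, 5.0, 10.0]  # positive down
>>> Z = [-d for d in data_depths] if model_depth_attr_positive == "up" else data_depths
>>> Z
[-2.0, -5.0, -10.0]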

ocean_model_skill_assessor.main._dam_from_dsm(dsm2, key_variable, key_variable_data, source_metadata, no_Z, logger=None)

Select or calculate variable from Dataset.

cf-xarray needs to work for Z, T, longitude, and latitude after this.

Parameters:
  • dsm2 (Dataset) – Dataset containing model output. If this is being run from main, the model output has already been narrowed to the relevant time range.

  • key_variable (str, dict) – Information to select variable from Dataset. Will be a dict if something needs to be calculated or accessed. In the more simple case will be a string containing the key variable name that can be interpreted with cf-xarray to access the variable of interest from the Dataset.

  • key_variable_data (str) – A string containing the key variable name that can be interpreted with cf-xarray to access the variable of interest from the Dataset.

  • source_metadata (dict) – Metadata for dataset source. Accessed by cat[source_name].metadata.

  • no_Z (bool) – If True, set Z=None so no vertical interpolation or selection occurs. Do this if your variable has no concept of depth, like the sea surface height.

  • logger (logger, optional) – Logger for messages.

Returns:

Single variable DataArray from Dataset.

Return type:

DataArray
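
Examples

In the simple (string) case, the selection amounts to a cf-xarray lookup. This sketch assumes dsm2 is a model Dataset whose variables cf-xarray can identify, with “temp” standing in for key_variable_data:

>>> import cf_xarray  # noqa: F401, registers the .cf accessor
>>> dam = dsm2.cf["temp"]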

ocean_model_skill_assessor.main._find_data_time_range(cat, source_name)

Determine min and max data times.

Parameters:
  • cat (Catalog) – Catalog that contains dataset source_name from which to find data time range.

  • source_name (str) – Name of dataset within cat to examine.

Returns:

  • data_min_time (pd.Timestamp) – The min time in the dataset catalog metadata, or if there is a constraint in the metadata such as an ERDDAP catalog allows, and it is more constrained than data_min_time, then the constraint time. If “Z” is present to indicate UTC timezone, it is removed.

  • data_max_time (pd.Timestamp) – The max time in the dataset catalog metadata, or if there is a constraint in the metadata such as an ERDDAP catalog allows, and it is more constrained than data_max_time, then the constraint time. If “Z” is present to indicate UTC timezone, it is removed.
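
Examples

One way the UTC marker “Z” can be stripped from a catalog time, shown with a hypothetical timestamp:

>>> import pandas as pd
>>> pd.Timestamp("2022-01-01T00:00:00Z").tz_localize(None)
Timestamp('2022-01-01 00:00:00')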

ocean_model_skill_assessor.main._initial_model_handling(model_name, paths, model_source_name=None)

Initial model handling.

cf-xarray needs to be able to identify Z, T, longitude, latitude coming out of here.

Parameters:
  • model_name (str, Catalog) – Name of catalog for model output, created with make_catalog call, or Catalog instance.

  • paths (Paths) – Paths object for finding paths to use.

  • model_source_name (str, optional) – Use this to access a specific source in the input model_catalog instead of otherwise just using the first source in the catalog.

Returns:

Dataset pointing to model output.

Return type:

Dataset
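
Examples

A rough sketch of reading a model source from a catalog with intake; the catalog file name here is hypothetical:

>>> import intake
>>> cat = intake.open_catalog("model_cat.yaml")
>>> model_source_name = None  # or a specific source name in the catalog
>>> dsm = cat[model_source_name or list(cat)[0]].to_dask()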

ocean_model_skill_assessor.main._is_outside_boundary(p1, lon, lat, source_name, logger=None)

Check whether a point is outside the model domain.

This currently assumes that the dataset is fixed in space.

Parameters:
  • p1 (shapely.Polygon) – Model domain boundary

  • lon (float) – Longitude of point to compare with model domain boundary

  • lat (float) – Latitude of point to compare with model domain boundary

  • source_name (str) – Name of dataset within cat to examine.

  • logger (logger, optional) – Logger for messages.

Returns:

True if lon, lat point is outside the model domain boundary, otherwise False.

Return type:

bool
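
Examples

The check boils down to a shapely point-in-polygon test; the domain polygon here is hypothetical:

>>> from shapely.geometry import Point, Polygon
>>> p1 = Polygon([(-152, 57), (-148, 57), (-148, 60), (-152, 60)])
>>> not p1.contains(Point(-150, 58))  # shapely points are ordered (lon, lat), i.e. (x, y)
False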

ocean_model_skill_assessor.main._narrow_model_time_range(dsm, user_min_time, user_max_time, model_min_time, model_max_time, data_min_time, data_max_time)

Narrow the model time range to approximately what is needed, to save memory.

If user_min_time and user_max_time were input and are not null values and are narrower than the model time range, use those to control time range.

Otherwise use data_min_time and data_max_time to narrow the time range, but add 1 model timestep on either end to make sure to have extra model output if need to interpolate in that range.

Time is not dealt with in detail here since that will happen when the model and data are “aligned” a little later. For now, just return a slice of model times, outside of the extract_model code, since no interpolation is happening yet. The case in which the data overlap the model time range but extend before or after it is not handled. The narrowed dataset is returned under a new name, since dsm keeps the full set of times, which might be needed for the other datasets.

Parameters:
  • dsm (xr.Dataset) – model dataset

  • user_min_time (pd.Timestamp) – If this is input, it will be used as the min time for the model. At this point in the code, it will be a pandas Timestamp though could be “NaT” (a null time value).

  • user_max_time (pd.Timestamp) – If this is input, it will be used as the max time for the model. At this point in the code, it will be a pandas Timestamp though could be “NaT” (a null time value).

  • model_min_time (pd.Timestamp) – Min model time step

  • model_max_time (pd.Timestamp) – Max model time step

  • data_min_time (pd.Timestamp) – The min time in the dataset catalog metadata, or if there is a constraint in the metadata such as an ERDDAP catalog allows, and it is more constrained than data_min_time, then the constraint time.

  • data_max_time (pd.Timestamp) – The max time in the dataset catalog metadata, or if there is a constraint in the metadata such as an ERDDAP catalog allows, and it is more constrained than data_max_time, then the constraint time.

Returns:

Model dataset, but narrowed in time.

Return type:

xr.Dataset
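
Examples

The narrowing is essentially a time slice. This sketch assumes dsm is a Dataset whose T axis cf-xarray can identify; the dates are hypothetical:

>>> import cf_xarray  # noqa: F401
>>> dsm2 = dsm.cf.sel(T=slice("2022-01-01", "2022-01-31"))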

ocean_model_skill_assessor.main._process_model(dsm2, preprocess, need_xgcm_grid, kwargs_xroms, logger=None)

Possibly process the model output a second time.

Parameters:
  • dsm2 (xr.Dataset) – Model output Dataset, already narrowed in time.

  • preprocess (bool) – True to preprocess.

  • need_xgcm_grid (bool) – True if need to find xgcm grid object.

  • kwargs_xroms (dict) – Keyword arguments to pass to xroms.

  • logger (logger, optional) – Logger for messages.

Returns:

  • dsm2: Model output, possibly modified

  • grid: xgcm grid object or None

  • preprocessed: bool that is True if model output was processed in this function

Return type:

tuple
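
Examples

If ROMS-specific processing is needed, the grid setup might look roughly like this, assuming xroms.roms_dataset returns the Dataset and the xgcm grid:

>>> import xroms
>>> dsm2, grid = xroms.roms_dataset(dsm2, **kwargs_xroms)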

ocean_model_skill_assessor.main._processed_file_names(fname_processed_orig, dfd_type, user_min_time, user_max_time, paths, ts_mods, logger=None)

Determine file names for the base of stats and figure names, and for the processed data and model files.

  • fname_processed_orig: no info about time modifications

  • fname_processed: fully specific name

  • fname_processed_data: processed data file

  • fname_processed_model: processed model file

Parameters:
  • fname_processed_orig (str) – Base filename, without the modification that is applied if user_min_time and user_max_time were input. Does include info about ts_mods if present.

  • dfd_type (type) – pd.DataFrame or xr.Dataset depending on the data container type.

  • user_min_time (pd.Timestamp) – If this is input, it will be used as the min time for the model. At this point in the code, it will be a pandas Timestamp though could be “NaT” (a null time value).

  • user_max_time (pd.Timestamp) – If this is input, it will be used as the max time for the model. At this point in the code, it will be a pandas Timestamp though could be “NaT” (a null time value).

  • paths (Paths) – Paths object for finding paths to use.

  • ts_mods (list) – list of time series modifications to apply to data and model. Can be an empty list if no modifications to apply.

  • logger (logger, optional) – Logger for messages.

Returns:

  • fname_processed: base to be used for stats and figure

  • fname_processed_data: file name for processed data

  • fname_processed_model: file name for processed model

  • model_file_name: (unprocessed) model output

Return type:

tuple of Paths
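
Examples

A sketch of the kind of file names returned, with hypothetical names; the actual naming scheme is determined inside the function:

>>> from pathlib import Path
>>> fname_processed = Path("demo_project") / "station1_temp"
>>> fname_processed_data = fname_processed.with_suffix(".csv")  # pd.DataFrame case
>>> fname_processed_model = fname_processed.with_suffix(".nc")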

ocean_model_skill_assessor.main._return_data_locations(maps, dd, featuretype, logger=None)

Return lon, lat locations from dataset.

Parameters:
  • maps (list) – Each entry is a list of information about a dataset; the last entry is for the present source_name or dataset. Each entry contains [min_lon, max_lon, min_lat, max_lat, source_name] and possibly an additional element containing “maptype”.

  • dd (Union[pd.DataFrame, xr.Dataset]) – Dataset

  • featuretype (str) – NCEI feature type for dataset

  • logger (logger, optional) – Logger for messages.

Returns:

  • lons: float or array of floats

  • lats: float or array of floats

Return type:

tuple

ocean_model_skill_assessor.main._return_mask(mask, dsm, lon_name, wetdry, key_variable_data, paths, logger=None)

Find or calculate and check mask.

Parameters:
  • mask (xr.DataArray or None) – Values are 1 for active cells and 0 for inactive grid cells in the model dsm.

  • dsm (xr.Dataset) – Model output Dataset

  • lon_name (str) – variable name for longitude in dsm.

  • wetdry (bool) – Adjusts the logic in the search for the mask such that, if True, the selected mask must include “wetdry” in its name, and the first time step will be used.

  • key_variable_data (str) – Key name of variable

  • paths (Paths) – Paths to files and directories for this project.

  • logger (logger, optional) – Logger for messages.

Returns:

Mask

Return type:

DataArray
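
Examples

A sketch of the two mask flavors, assuming dsm is the model Dataset and using hypothetical ROMS-style variable names:

>>> mask = dsm["mask_rho"]  # static mask
>>> mask = dsm["wetdry_mask_rho"].isel(ocean_time=0)  # wetdry=True: first time step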

ocean_model_skill_assessor.main._return_p1(paths, dsm, mask, alpha, dd, logger=None)

Find and return the model domain boundary.

Parameters:
  • paths (Paths) – Paths object for finding paths to use.

  • dsm (xr.Dataset) – Model output Dataset.

  • mask (xr.DataArray or None) – Values are 1 for active cells and 0 for inactive grid cells in the model dsm.

  • alpha (int, optional) – Parameter for alphashape that determines how tightly the boundary polygon follows the points. A larger number gives a more detailed boundary; 1 is a good starting point.

  • dd (int, optional) – Number to decimate model output lon/lat, as a stride.

  • skip_mask (bool) – Allows user to override mask behavior and keep it as None. Good for testing. Default False.

  • logger (logger, optional) – Logger for messages.

Returns:

Model domain boundary

Return type:

shapely.Polygon
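
Examples

A sketch of the boundary calculation with the alphashape package; the coordinate names are hypothetical and dd is the decimation stride:

>>> import alphashape
>>> lons = dsm["lon_rho"].values[::dd, ::dd]
>>> lats = dsm["lat_rho"].values[::dd, ::dd]
>>> points = list(zip(lons.ravel(), lats.ravel()))
>>> p1 = alphashape.alphashape(points, alpha)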

ocean_model_skill_assessor.main._select_process_save_model(select_kwargs, source_name, model_source_name, model_file_name, save_horizontal_interp_weights, key_variable_data, maps, paths, logger=None)

Select model output, process, and save to file.

Parameters:
  • select_kwargs (dict) – Keyword arguments to send to em.select() for model extraction

  • source_name (str) – Name of dataset within cat to examine.

  • model_source_name (str) – Source name for model in the model catalog

  • model_file_name (pathlib.Path) – Path to where to save model output

  • save_horizontal_interp_weights (bool) – Default True. Whether or not to save horizontal interp info like Delaunay triangulation to file. Set to False to not save, which is useful for testing.

  • key_variable_data (str) – Name of variable to select, to be interpreted with cf-xarray

  • maps (list) – Each entry is a list of information about a dataset; the last entry is for the present source_name or dataset. Each entry contains [min_lon, max_lon, min_lat, max_lat, source_name] and possibly an additional element containing “maptype”.

  • paths (Paths) – Paths object for finding paths to use.

  • logger (logger, optional) – Logger for messages.

Returns:

  • model_var: xr.Dataset with selected model output

  • skip_dataset: True if we should skip this dataset due to checks in this function

  • maps: Same as input except might be missing final entry if skipping this dataset

Return type:

tuple
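
Examples

A rough sketch of the core extraction and save steps, assuming select_kwargs was assembled by the caller for em.select() and model_file_name comes from the parameters:

>>> import extract_model as em
>>> model_var = em.select(**select_kwargs)
>>> model_var.to_netcdf(model_file_name)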

ocean_model_skill_assessor.main.make_catalog(catalog_type, project_name, catalog_name=None, description=None, metadata=None, kwargs=None, kwargs_search=None, kwargs_open=None, skip_strings=None, vocab=None, return_cat=True, save_cat=False, verbose=True, mode='w', testing=False, cache_dir=None)

Make a catalog given input selections.

Parameters:
  • catalog_type (str) – Which type of catalog to make? Options are “erddap”, “axds”, or “local”.

  • project_name (str) – Subdirectory in cache dir to store files associated together.

  • catalog_name (str, optional) – Catalog name, with or without suffix of yaml. Otherwise a default name based on the catalog type will be used.

  • description (str, optional) – Description for catalog.

  • metadata (dict, optional) – Catalog metadata.

  • kwargs (dict, optional) – Available keyword arguments for catalog types. Find more information about options in the original docs for each type. Some inputs might be required, depending on the catalog type.

  • kwargs_search (dict, optional) –

    Keyword arguments to input to search on the server before making the catalog. These are not used with make_local_catalog(); only for catalog types “erddap” and “axds”. Options are:

    • to search by bounding box: include all of min_lon, max_lon, min_lat, max_lat (int or float). Longitudes must be between -180 and +180.

    • to search within a datetime range: include both of min_time, max_time: interpretable datetime string, e.g., “2021-1-1”

    • to search using a textual keyword: include search_for as a string.

    • model_name can be input in place of either the spatial box or the time range, or both, in which case those values will be found from the model output. model_name should match a catalog file in the directory described by project_name.

  • kwargs_open (dict, optional) – Keyword arguments to save into the local catalog for the model, to pass on to the xr.open_mfdataset call or to pd.read_csv. Only for use with catalog_type=local.

  • skip_strings (list of strings, optional) – If provided, source_names in catalog will only be checked for goodness if they do not contain one of skip_strings. For example, if skip_strings=[“_base”] then any source in the catalog whose name contains that string will be skipped.

  • vocab (str, Vocab, Path, optional) – Way to find the criteria to use to map from variable to attributes describing the variable. This is to be used with a key representing what variable to search for.

  • return_cat (bool, optional) – Return catalog. For when using as a Python package instead of with command line.

  • save_cat (bool, optional) – Save catalog to disk into project directory under catalog_name.

  • verbose (bool, optional) – Print useful runtime commands to stdout if True as well as save in log, otherwise silently save in log.

  • mode (str, optional) – mode for logging file. Default is to overwrite an existing logfile, but can be changed to other modes, e.g. “a” to instead append to an existing log file.

  • testing (boolean, optional) – Set to True if testing so warnings come through instead of being logged.

  • cache_dir (str, Path) – Pass on to omsa.paths to set cache directory location if you don’t want to use the default. Good for testing.
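
Examples

Make a catalog by searching an ERDDAP server; the server and search choices here are hypothetical:

>>> import ocean_model_skill_assessor as omsa
>>> cat = omsa.make_catalog(
...     catalog_type="erddap",
...     project_name="demo_project",
...     kwargs={"server": "https://erddap.sensors.ioos.us/erddap"},
...     kwargs_search={
...         "min_lon": -154, "max_lon": -150, "min_lat": 56, "max_lat": 60,
...         "min_time": "2022-1-1", "max_time": "2022-1-8",
...     },
...     save_cat=True,
... )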

ocean_model_skill_assessor.main.make_local_catalog(filenames, filetype=None, name='local_catalog', description='Catalog of user files.', metadata=None, metadata_catalog=None, skip_entry_metadata=False, skip_strings=None, kwargs_open=None, logger=None)

Make an intake catalog from specified data files, including model output locations.

Pass keywords for xarray for model output into the catalog through kwargs_open.

kwargs_open and metadata must be the same for all filenames. If they are not, make multiple catalogs and input them individually into the run command.

Parameters:
  • filenames (list of paths) – Where to find dataset(s) from which to make local catalog.

  • filetype (str, optional) – Type of the input filenames, if you don’t want the function to try to guess. Must be in the form that can go into intake as f”open_{filetype}”.

  • name (str, optional) – Name for catalog.

  • description (str, optional) – Description for catalog.

  • metadata (dict, optional) – Metadata for individual source. If input dataset does not include the longitude and latitude position(s), you will need to include it in the metadata as keys minLongitude, minLatitude, maxLongitude, maxLatitude.

  • metadata_catalog (dict, optional) – Metadata for catalog.

  • skip_entry_metadata (bool, optional) – This is useful for testing, in which case we don’t want to actually read the file. If you are making a catalog file for a model, you may want to set this to True to avoid reading it all in for metadata.

  • skip_strings (list of strings, optional) – If provided, source_names in catalog will only be checked for goodness if they do not contain one of skip_strings. For example, if skip_strings=[“_base”] then any source in the catalog whose name contains that string will be skipped.

  • kwargs_open (dict, optional) – Keyword arguments to pass on to the appropriate intake open_* call for model or dataset.

Returns:

Intake catalog with an entry for each dataset represented by a filename.

Return type:

Catalog

Examples

Make catalog to represent local or remote files with specific locations:

>>> make_local_catalog([filename1, filename2])

Make catalog to represent model output:

>>> make_local_catalog([model output location], skip_entry_metadata=True, kwargs_open={"drop_variables": "tau"})
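
Make catalog for a data file whose fixed location is not inside the file itself, supplying the position through metadata (hypothetical values):

>>> make_local_catalog(
...     ["station1.csv"],
...     metadata={"minLongitude": -151.5, "maxLongitude": -151.5,
...               "minLatitude": 59.6, "maxLatitude": 59.6},
... )
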
ocean_model_skill_assessor.main.run(catalogs, project_name, key_variable, model_name, vocabs=None, vocab_labels=None, ndatasets=None, kwargs_map=None, verbose=True, mode='w', testing=False, alpha=5, dd=2, preprocess=False, need_xgcm_grid=False, xcmocean_options=None, kwargs_xroms=None, locstream=True, interpolate_horizontal=True, horizontal_interp_code='delaunay', save_horizontal_interp_weights=True, want_vertical_interp=False, extrap=False, model_source_name=None, catalog_source_names=None, user_min_time=None, user_max_time=None, check_in_boundary=True, tidal_filtering=None, ts_mods=None, model_only=False, plot_map=True, no_Z=False, skip_mask=False, wetdry=False, plot_count_title=True, cache_dir=None, return_fig=False, override_model=False, override_processed=False, override_stats=False, override_plot=False, plot_description=None, kwargs_plot=None, skip_key_variable_check=False, **kwargs)

Run the model-data comparison.

Note that timezones are assumed to match between the model output and data.

To avoid calculating a mask you need to input skip_mask=True, check_in_boundary=False, and plot_map=False.

Parameters:
  • catalogs (str, list, Catalog) – Catalog name(s) or list of names, or catalog object or list of catalog objects. Datasets will be accessed from catalog entries.

  • project_name (str) – Subdirectory in cache dir to store files associated together.

  • key_variable (str, dict) – Key in vocab(s) representing variable to compare between model and datasets.

  • model_name (str, Catalog) – Name of catalog for model output, created with make_catalog call, or Catalog instance.

  • vocabs (str, list, Vocab, PurePath, optional) – Criteria to use to map from variable to attributes describing the variable. This is to be used with a key representing what variable to search for. This input is the name of one or more existing vocabularies which are stored in a user application cache. This should be supplied; however, it is made optional because it could be provided by setting it outside of the OMSA code.

  • vocab_labels (dict, optional) – Ultimately a dictionary whose keys match the input vocab and whose values are strings to be used in plot labels, such as “Sea water temperature [C]” for the key “temp”. They can be input from a stored file or as a dict. The user has to make sure the labels match both the data and model; there is no unit handling.

  • ndatasets (int, optional) – Max number of datasets from each input catalog to use.

  • kwargs_map (dict, optional) – Keyword arguments to pass on to omsa.plot.map.plot_map call.

  • verbose (bool, optional) – Print useful runtime commands to stdout if True as well as save in log, otherwise silently save in log.

  • mode (str, optional) – mode for logging file. Default is to overwrite an existing logfile, but can be changed to other modes, e.g. “a” to instead append to an existing log file.

  • testing (boolean, optional) – Set to True if testing so warnings come through instead of being logged.

  • alpha (int) – Parameter for alphashape. 0 returns qhull, and higher values make a tighter polygon around the points.

  • dd (int) – Number to decimate model points by when calculating the model boundary with alphashape. Input 1 to not decimate.

  • preprocess (bool, optional) – If True, use function from extract_model to preprocess model output.

  • need_xgcm_grid (bool) – If True, try to set up xgcm grid for run, which will be used for the variable calculation for the model.

  • kwargs_xroms (dict) – Optional keyword arguments to pass to xroms.open_dataset

  • locstream (boolean, optional) –

    Which type of interpolation to do, passed to em.select():

    • False: 2D array of points with 1 dimension the lons and the other dimension the lats.

    • True: lons/lats as unstructured coordinate pairs (in xESMF language, LocStream).

  • interpolate_horizontal (bool, optional) – If True, interpolate horizontally. Otherwise find nearest model points.

  • horizontal_interp_code (str) – Default “delaunay”. Input “xesmf” to use package xESMF for horizontal interpolation, which is probably better if you need to interpolate to many points; to use xESMF you must install it as an optional dependency. Input “tree” to use BallTree to find the nearest 3 neighbors and interpolate using barycentric coordinates; this has been tested for interpolating to 3 locations so far. Input “delaunay” to use a Delaunay triangulation to find the nearest triangle points and interpolate the same way as with “tree”, using barycentric coordinates; this should be faster when you have more points to interpolate to, especially if you save and reuse the triangulation.

  • save_horizontal_interp_weights (bool) – Default True. Whether or not to save horizontal interp info like Delaunay triangulation to file. Set to False to not save, which is useful for testing.

  • want_vertical_interp (bool) – This is False unless the user wants to specify that vertical interpolation should happen. This is used in only certain cases, but in those cases it is important so that it is known to interpolate instead of trying to figure out a vertical level index (which is not currently possible).

  • extrap (bool) – Passed to extract_model.select(). Defaults to False. Pass True to extrapolate outside the model domain.

  • model_source_name (str, optional) – Use this to access a specific source in the input model_catalog instead of otherwise just using the first source in the catalog.

  • catalog_source_names (list of str, optional) – Source names from the input catalog(s) to use, if you don’t want to use them all.

  • user_min_time (str, optional) – If this is input, it will be used as the min time for the model

  • user_max_time (str, optional) – If this is input, it will be used as the max time for the model

  • check_in_boundary (bool) – If True, station location will be compared against model domain polygon to check if inside domain. Set to False to skip this check which might be desirable if you want to just compare with the closest model point.

  • tidal_filtering (dict, optional) – tidal_filtering["model"]=True to tidally filter model output after em.select() is run, and tidal_filtering["data"]=True to tidally filter data.

  • ts_mods (list) – list of time series modifications to apply to data and model.

  • model_only (bool) – If True, reads in model output and saves to cache, then stops. Default False.

  • plot_map (bool) – If False, don’t plot map

  • no_Z (bool) – If True, set Z=None so no vertical interpolation or selection occurs. Do this if your variable has no concept of depth, like the sea surface height.

  • skip_mask (bool) – Allows user to override mask behavior and keep the mask as None. Good for testing. Default False. If True, the mask is also skipped in the p1 calculation and in map plotting, when those are enabled.

  • wetdry (bool) – If True, insist that masked used has “wetdry” in the name and then use the first time step of that mask.

  • plot_count_title (bool) – If True, have a count to match the map of the station number in the title, like “0: [station name]”. Otherwise skip count.

  • cache_dir (str, Path) – Pass on to omsa.paths to set cache directory location if you don’t want to use the default. Good for testing.

  • return_fig (bool) – Set to True to return all outputs from this function. Use for testing. Only works if using a single source.

  • override_model (bool) – Flag to force-redo model selection. Default False.

  • override_processed (bool) – Flag to force-redo model and data processing. Default False.

  • override_stats (bool) – Flag to force-redo stats calculation. Default False.

  • override_plot (bool) – Flag to force-redo the plot. If True, redoes only the plot itself if the other needed files are already available. Default False.

  • kwargs_plot (dict) – Keyword arguments to pass to the omsa plot selection and, through it, to the subsequent plot itself for each source. If you need finer control, run the run function per source.

  • skip_key_variable_check (bool) – If True, don’t check for key_variable name being in catalog source metadata.
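
Examples

A minimal sketch of a model-data comparison run; the catalog names, variable key, and vocabulary name are placeholders to be replaced with your own:

>>> import ocean_model_skill_assessor as omsa
>>> omsa.run(
...     catalogs="my_data_catalog",
...     project_name="demo_project",
...     key_variable="temp",
...     model_name="my_model_catalog",
...     vocabs="general",
... )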