Python API#

statemodify offers a programmatic API in Python.

Note

For questions or requests for support, please reach out to the development team. Your feedback is much appreciated in evolving this API!

Input Modification#

statemodify.modify_eva#

statemodify.modify_eva(modify_dict, query_field, output_dir, scenario, basin_name, sampling_method='LHS', n_samples=1, skip_rows=1, n_jobs=-1, seed_value=None, template_file=None, factor_method='add', data_specification_file=None, min_bound_value=-0.5, max_bound_value=1.0, save_sample=False, sample_array=None)[source]#

Modify StateMod net reservoir evaporation annual data file (.eva) using a Latin Hypercube Sample from the user.

Samples are processed in parallel. Modification targets the ids the user specifies in the modify_dict argument. The user must specify bounds for sampling.

Parameters:
  • modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.

  • query_field (str) – Field name to use for target query.

  • output_dir (str) – Path to output directory.

  • scenario (str) – Scenario name.

  • basin_name (str) – Name of basin. One of: Upper_Colorado, Yampa, San_Juan, Gunnison, White.

  • sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling

  • n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.

  • skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 1

  • n_jobs (int) – Number of jobs to process in parallel. Defaults to -1 meaning all but 1 processor.

  • seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.

  • template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.

  • factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’. Defaults to ‘add’.

  • data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.

  • min_bound_value (float) – Minimum feasible sampling bound in feet per month. Minimum allowable value: -0.5

  • max_bound_value (float) – Maximum feasible sampling bound in feet per month. Maximum allowable value: 1.0

  • save_sample (bool) – Choice to save LHS sample or not; default False. If True, sample array will be written to the output directory.

  • sample_array (np.array) – Optionally provide array containing sample instead of generating it.

Returns:

None

Return type:

None

Example:

import statemodify as stm

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {"ids": ["10001", "10004"], "bounds": [-0.5, 1.0]}

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# the number of samples you wish to generate
n_samples = 4

# seed value for reproducibility if so desired
seed_value = None

# number of rows to skip in file after comment
skip_rows = 1

# name of field to query
query_field = "id"

# number of jobs to launch in parallel; -1 is all but 1 processor used
n_jobs = -1

# basin to process
basin_name = "Upper_Colorado"

# generate a batch of files using generated LHS
stm.modify_eva(
    modify_dict=setup_dict,
    query_field=query_field,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    sampling_method="LHS",
    n_samples=n_samples,
    skip_rows=skip_rows,
    n_jobs=n_jobs,
    seed_value=seed_value,
    template_file=None,
    factor_method="add",
    data_specification_file=None,
    min_bound_value=-0.5,
    max_bound_value=1.0,
    save_sample=False,
)
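As a sketch of what factor_method="add" does to the evaporation values, the following minimal numpy example applies one sampled factor to hypothetical monthly values (this illustrates the arithmetic only, not the package's internal implementation):

```python
import numpy as np

# hypothetical monthly net evaporation values (feet) for one structure id
monthly_eva = np.array([0.25, 0.31, 0.40])

# a single factor drawn from the [-0.5, 1.0] sampling bounds
factor = 0.39

# factor_method="add" shifts every value by the sampled factor
modified = monthly_eva + factor
```

A factor of 0.0 would leave the file unchanged; negative factors reduce net evaporation.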

statemodify.modify_single_eva#

statemodify.modify_single_eva(modify_dict, query_field, output_dir, scenario, basin_name, sample, sample_id=0, skip_rows=1, template_file=None, factor_method='add', data_specification_file=None, min_bound_value=-0.5, max_bound_value=1.0)[source]#

Modify StateMod net reservoir evaporation annual data file (.eva) using a single sample provided by the user.

Modification targets the ids the user specifies in the modify_dict argument. The user must specify bounds for each field name.

Parameters:
  • modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.

  • query_field (str) – Field name to use for target query.

  • sample (np.array) – An array of samples for each parameter.

  • sample_id (int) – Numeric ID of sample that is being processed. Defaults to 0.

  • output_dir (str) – Path to output directory.

  • scenario (str) – Scenario name.

  • basin_name (str) – Name of basin. One of: Upper_Colorado, Yampa, San_Juan, Gunnison, White.

  • skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 1

  • template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise, the default template in this package will be used.

  • factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’. Defaults to ‘add’.

  • data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.

  • min_bound_value (float) – Minimum feasible sampling bound in feet per month. Minimum allowable value: -0.5

  • max_bound_value (float) – Maximum feasible sampling bound in feet per month. Maximum allowable value: 1.0

Returns:

None

Return type:

None

Example:

import numpy as np

import statemodify as stm

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {"ids": ["10001", "10004"], "bounds": [-0.5, 1.0]}

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# sample id for the current run
sample_id = 0

# sample array for each parameter
sample = np.array([0.39])

# number of rows to skip in file after comment
skip_rows = 1

# name of field to query
query_field = "id"

# basin to process
basin_name = "Upper_Colorado"

# modify a single file using the sample provided
stm.modify_single_eva(
    modify_dict=setup_dict,
    query_field=query_field,
    sample=sample,
    sample_id=sample_id,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    skip_rows=skip_rows,
    template_file=None,
    factor_method="add",
    data_specification_file=None,
    min_bound_value=-0.5,
    max_bound_value=1.0,
)
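Because modify_single_eva takes a pre-generated sample, it can be paired with any sampler. The package uses SALib for its LHS, but the stratified idea behind Latin Hypercube Sampling can be sketched in plain numpy (a toy one-dimensional version; the function name and defaults here are illustrative):

```python
import numpy as np


def tiny_lhs(n_samples: int, low: float, high: float, seed: int = 0) -> np.ndarray:
    """Draw one point from each of n_samples equal-width strata, then shuffle."""
    rng = np.random.default_rng(seed)
    strata = (np.arange(n_samples) + rng.random(n_samples)) / n_samples
    rng.shuffle(strata)
    return low + strata * (high - low)


# four samples within the eva bounds [-0.5, 1.0]; each could feed one
# modify_single_eva call with its own sample_id
samples = tiny_lhs(4, low=-0.5, high=1.0, seed=42)
```

Each stratum of the bounded interval contributes exactly one sample, which is what distinguishes LHS from plain uniform sampling.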

statemodify.modify_ddm#

statemodify.modify_ddm(modify_dict, query_field, output_dir, scenario, basin_name, sampling_method='LHS', n_samples=1, skip_rows=1, n_jobs=-1, seed_value=None, template_file=None, factor_method='multiply', data_specification_file=None, min_bound_value=0.5, max_bound_value=1.5, save_sample=False, sample_array=None)[source]#

Parallel modification of StateMod municipal, industrial, transbasin Demands (.ddm).

Files are modified using a Latin Hypercube Sample from the user. Samples are processed in parallel. Modification targets the ids specified in the modify_dict argument. The user must specify the bounds from which the samples will be generated.

Parameters:
  • modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.

  • query_field (str) – Field name to use for target query.

  • output_dir (str) – Path to output directory.

  • scenario (str) – Scenario name.

  • basin_name (str) – Name of basin. One of: Upper_Colorado, Yampa, San_Juan, Gunnison, White.

  • sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling

  • n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.

  • skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 1

  • n_jobs (int) – Number of jobs to process in parallel. Defaults to -1 meaning all but 1 processor.

  • seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.

  • template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.

  • factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’. Defaults to ‘multiply’.

  • data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.

  • min_bound_value (float) – Minimum feasible sampling bound (a unitless multiplier). Minimum allowable value: 0.5

  • max_bound_value (float) – Maximum feasible sampling bound (a unitless multiplier). Maximum allowable value: 1.5

  • save_sample (bool) – Choice to save LHS sample or not; default False. If True, sample array will be written to the output directory.

  • sample_array (np.array) – Optionally provide array containing sample instead of generating it.

Returns:

None

Return type:

None

Example:

import statemodify as stm

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {"ids": ["3600507", "3600603"], "bounds": [0.5, 1.0]}

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# the number of samples you wish to generate
n_samples = 4

# seed value for reproducibility if so desired
seed_value = None

# number of rows to skip in file after comment
skip_rows = 1

# name of field to query
query_field = "id"

# number of jobs to launch in parallel; -1 is all but 1 processor used
n_jobs = -1

# basin to process
basin_name = "Upper_Colorado"

# generate a batch of files using generated LHS
stm.modify_ddm(
    modify_dict=setup_dict,
    query_field=query_field,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    sampling_method="LHS",
    n_samples=n_samples,
    skip_rows=skip_rows,
    n_jobs=n_jobs,
    seed_value=seed_value,
    template_file=None,
    factor_method="multiply",
    data_specification_file=None,
    min_bound_value=0.5,
    max_bound_value=1.5,
    save_sample=False,
)
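With factor_method="multiply" (the default here), each sampled factor scales the demand values instead of shifting them. A minimal numpy sketch with hypothetical demand values:

```python
import numpy as np

# hypothetical monthly demand values (acre-feet) for one structure id
monthly_demand = np.array([100.0, 250.0, 180.0])

# a single factor drawn from the [0.5, 1.5] sampling bounds;
# 1.0 reproduces the original demands exactly
factor = 0.8

# factor_method="multiply" scales every value by the sampled factor
modified = monthly_demand * factor
```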

statemodify.modify_single_ddm#

statemodify.modify_single_ddm(modify_dict, query_field, output_dir, scenario, basin_name, sample, sample_id=0, skip_rows=1, template_file=None, factor_method='multiply', data_specification_file=None, min_bound_value=0.5, max_bound_value=1.5)[source]#

Modify StateMod municipal, industrial, transbasin Demands (.ddm) using a sample provided by the user.

Modification targets the ids specified in the modify_dict argument. The user must specify the bounds from which the sample was generated.

Parameters:
  • modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.

  • query_field (str) – Field name to use for target query.

  • sample (np.array) – An array of samples for each parameter.

  • sample_id (int) – Numeric ID of sample that is being processed. Defaults to 0.

  • output_dir (str) – Path to output directory.

  • scenario (str) – Scenario name.

  • basin_name (str) – Name of basin. One of: Upper_Colorado, Yampa, San_Juan, Gunnison, White.

  • skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 1

  • template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.

  • factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’. Defaults to ‘multiply’.

  • data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.

  • min_bound_value (float) – Minimum feasible sampling bound (a unitless multiplier). Minimum allowable value: 0.5

  • max_bound_value (float) – Maximum feasible sampling bound (a unitless multiplier). Maximum allowable value: 1.5

Returns:

None

Return type:

None

Example:

import numpy as np

import statemodify as stm

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {"ids": ["10001", "10004"], "bounds": [0.5, 1.0]}

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# sample id for the current run
sample_id = 0

# sample array for each parameter
sample = np.array([0.59, 0.72])

# number of rows to skip in file after comment
skip_rows = 1

# name of field to query
query_field = "id"

# basin to process
basin_name = "Upper_Colorado"

# modify a single file using the sample provided
stm.modify_single_ddm(
    modify_dict=setup_dict,
    query_field=query_field,
    sample=sample,
    sample_id=sample_id,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    skip_rows=skip_rows,
    template_file=None,
    factor_method="multiply",
    data_specification_file=None,
    min_bound_value=0.5,
    max_bound_value=1.5,
)

statemodify.modify_ddr#

statemodify.modify_ddr(modify_dict, query_field, output_dir, scenario, basin_name, sampling_method='LHS', n_samples=1, skip_rows=0, n_jobs=-1, seed_value=None, template_file=None, factor_method='multiply', data_specification_file=None, min_bound_value=0.5, max_bound_value=1.5, save_sample=False, sample_array=None)[source]#

Parallelized modification of StateMod water rights (.ddr) using a Latin Hypercube Sample from the user.

Samples are processed in parallel. Modification targets the ids the user specifies in the modify_dict argument. The user must specify bounds to generate the sample.

Parameters:
  • modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.

  • query_field (str) – Field name to use for target query.

  • output_dir (str) – Path to output directory.

  • scenario (str) – Scenario name.

  • basin_name (str) – Name of basin. One of: Upper_Colorado, Yampa, San_Juan, Gunnison, White.

  • sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling

  • n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.

  • skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 0

  • n_jobs (int) – Number of jobs to process in parallel. Defaults to -1 meaning all but 1 processor.

  • seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.

  • template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.

  • factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’. Defaults to ‘multiply’.

  • data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.

  • min_bound_value (float) – Minimum feasible sampling bound (a unitless multiplier). Minimum allowable value: 0.5

  • max_bound_value (float) – Maximum feasible sampling bound (a unitless multiplier). Maximum allowable value: 1.5

  • save_sample (bool) – Choice to save LHS sample or not; default False. If True, sample array will be written to the output directory.

  • sample_array (np.array) – Optionally provide array containing sample instead of generating it.

Returns:

None

Return type:

None

Example:

import statemodify as stm

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {
    # ids can either be 'struct' or 'id' values
    "ids": ["3600507.01", "3600507.02"],
    "bounds": [0.5, 1.0],
    # turn id on or off completely or for a given period
    # if 0 = off, 1 = on, YYYY = on for years >= YYYY, -YYYY = off for years > YYYY; see file header
    "on_off": [-1977, 1],
    # apply rank of administrative order where 0 is lowest (senior) and n is highest (junior); None is no change
    "admin": [[None, 2], [0, 1]],
    # optionally, pass a value that you want to assign for all ids; this overrides bounds
    "values": [0.7],
}

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# the number of samples you wish to generate
n_samples = 4

# seed value for reproducibility if so desired
seed_value = None

# number of rows to skip in file after comment
skip_rows = 0

# name of field to query
query_field = "id"

# number of jobs to launch in parallel; -1 is all but 1 processor used
n_jobs = -1

# basin to process
basin_name = "Upper_Colorado"

# generate a batch of files using generated LHS
stm.modify_ddr(
    modify_dict=setup_dict,
    query_field=query_field,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    sampling_method="LHS",
    n_samples=n_samples,
    skip_rows=skip_rows,
    n_jobs=n_jobs,
    seed_value=seed_value,
    template_file=None,
    factor_method="multiply",
    data_specification_file=None,
    min_bound_value=0.5,
    max_bound_value=1.5,
    save_sample=False,
)
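The on_off codes in the example follow the StateMod switch convention noted in the dictionary comments: 0 is off, 1 is on, a positive year enables the right for years at or after that year, and a negative year disables it for years after that year. A small helper capturing that decoding (an illustration of the convention, not code from the package):

```python
def right_is_active(on_off: int, year: int) -> bool:
    """Interpret a StateMod on_off switch for a given simulation year."""
    if on_off == 0:            # permanently off
        return False
    if on_off == 1:            # permanently on
        return True
    if on_off > 1:             # YYYY: on for years >= YYYY
        return year >= on_off
    return year <= -on_off     # -YYYY: off for years > YYYY
```

Under this reading, the example's on_off value of -1977 keeps the first right active through 1977 and off afterward.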

statemodify.modify_single_ddr#

statemodify.modify_single_ddr(modify_dict, query_field, output_dir, scenario, basin_name, sample=array([], dtype=float64), sample_id=0, skip_rows=0, template_file=None, factor_method='multiply', use_values=False, use_sampling=True, data_specification_file=None, min_bound_value=0.5, max_bound_value=1.5)[source]#

Modify StateMod water rights (.ddr) file from a sample provided by the user.

Modification targets the ids the user specifies in the modify_dict argument. The user must specify bounds if generating samples.

Parameters:
  • modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.

  • query_field (str) – Field name to use for target query.

  • sample (np.array) – An array of samples for each parameter.

  • sample_id (int) – Numeric ID of sample that is being processed. Defaults to 0.

  • output_dir (str) – Path to output directory.

  • scenario (str) – Scenario name.

  • basin_name (str) – Name of basin. One of: Upper_Colorado, Yampa, San_Juan, Gunnison, White.

  • skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 0

  • template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise, the default template in this package will be used.

  • factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’. Defaults to ‘multiply’.

  • data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.

  • min_bound_value (float) – Minimum feasible sampling bound (a unitless multiplier). Minimum allowable value: 0.5

  • max_bound_value (float) – Maximum feasible sampling bound (a unitless multiplier). Maximum allowable value: 1.5

  • use_values (bool) – If values is present in the modify dictionary, use it instead of the sampler. Defaults to False.

  • use_sampling (bool) – If bounds are not present in the modify dictionary, sampling will not be used. Defaults to True.

Returns:

None

Return type:

None

Example:

import numpy as np

import statemodify as stm

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {
    # ids can either be 'struct' or 'id' values
    "ids": ["3600507.01", "3600507.02"],
    "bounds": [0.5, 1.0],
    # turn id on or off completely or for a given period
    # if 0 = off, 1 = on, YYYY = on for years >= YYYY, -YYYY = off for years > YYYY; see file header
    "on_off": [-1977, 1],
    # apply rank of administrative order where 0 is lowest (senior) and n is highest (junior); None is no change
    "admin": [None, 0],
    # optionally, pass a value that you want to assign for all ids; this overrides bounds
    "values": [0.7],
}

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# sample id for the current run
sample_id = 0

# sample array for each parameter
sample = np.array([0.39, -0.42])

# number of rows to skip in file after comment
skip_rows = 0

# name of field to query
query_field = "id"

# basin to process
basin_name = "Upper_Colorado"

# modify a single file using the sample provided
stm.modify_single_ddr(
    modify_dict=setup_dict,
    query_field=query_field,
    sample=sample,
    sample_id=sample_id,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    skip_rows=skip_rows,
    template_file=None,
    use_values=False,
    use_sampling=True,
    factor_method="multiply",
    data_specification_file=None,
    min_bound_value=0.5,
    max_bound_value=1.5,
)

statemodify.apply_on_off_modification#

statemodify.apply_on_off_modification(df, modify_dict, query_field)[source]#

Apply on_off modification as specified by the user. Used with the water demand (.ddr) modification.

Parameters:
  • df (pd.DataFrame) – Data frame of extracted content from the source file.

  • modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.

  • query_field (str) – Field name to use for target query.

Returns:

Modified input.

Return type:

pd.DataFrame

statemodify.apply_seniority_modification#

statemodify.apply_seniority_modification(df, modify_dict, query_field)[source]#

Apply seniority modification as specified by the user. Used with the water demand (.ddr) modification.

Parameters:
  • df (pd.DataFrame) – Data frame of extracted content from the source file.

  • modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.

  • query_field (str) – Field name to use for target query.

Returns:

Modified input.

Return type:

pd.DataFrame
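Both helpers receive the DataFrame extracted from the .ddr file. As an illustration of the seniority-rank idea (rank 0 most senior, higher ranks more junior), a toy pandas sketch with hypothetical column names:

```python
import pandas as pd

# hypothetical extract of a .ddr file: smaller administration numbers
# correspond to more senior water rights
df = pd.DataFrame(
    {"id": ["3600507.01", "3600507.02", "3600507.03"],
     "admin": [10500.0, 99999.0, 20000.0]}
)

# rank rights from most senior (0) to most junior (n - 1)
df["seniority_rank"] = df["admin"].rank(method="first").astype(int) - 1
```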

statemodify.modify_xbm_iwr#

statemodify.modify_xbm_iwr(output_dir, flow_realizations_directory, scenario='', basin_name='Upper_Colorado', n_years=105, n_basins=5, xbm_skip_rows=1, iwr_skip_rows=1, xbm_template_file=None, iwr_template_file=None, xbm_data_specification_file=None, iwr_data_specification_file=None, months_in_year=12, seed_value=None, n_jobs=-1, n_samples=1, save_sample=False, randomly_select_flow_sample=True, desired_sample_number=None)[source]#

Generate flows for all samples for all basins in parallel to build modified XBM and IWR files.

Parameters:
  • output_dir (str) – Path to output directory.

  • flow_realizations_directory (str) – Full path to the directory containing the flow realization files for each sample. E.g., AnnualQ_s0.txt files produced by the ‘hmm_multisite_sample’ function.

  • scenario (str) – Scenario name.

  • basin_name (str) – Name of basin. One of: Upper_Colorado, Yampa, San_Juan, Gunnison, White.

  • n_years (int) – number of years

  • n_basins (int) – number of basins in HMM inputs

  • xbm_skip_rows (int) – number of rows to skip in XBM template file

  • iwr_skip_rows (int) – number of rows to skip in IWR template file

  • xbm_template_file (Union[None, str]) – Template file to build XBM adjustment off of

  • iwr_template_file (Union[None, str]) – Template file to build IWR adjustment off of

  • xbm_data_specification_file (Union[None, str]) – Specification YAML file for XBM format

  • iwr_data_specification_file (Union[None, str]) – Specification YAML file for IWR format

  • months_in_year (int) – Number of months in year

  • seed_value (Union[None, int], optional) – Integer to use for random seed or None if not desired

  • n_jobs (int) – Number of jobs to process in parallel. Defaults to -1 meaning all but 1 processor.

  • n_samples (int) – Number of samples to generate. Defaults to 1.

  • save_sample (bool) – Choice to save LHS sample or not; default False. If True, sample array will be written to the output directory.

  • randomly_select_flow_sample (bool) – Choice to randomly select a realization sample from the flow files directory.

  • desired_sample_number (Union[None, int]) – If ‘randomly_select_flow_sample’ is set to False, select a desired sample number as an integer

Example:

import statemodify as stm


output_directory = "<your desired output directory>"
flow_realizations_directory = (
    "<directory where the flow realization files are kept>"
)
scenario = "<your scenario name>"

# basin name to process
basin_name = "Upper_Colorado"

# seed value for reproducibility if so desired
seed_value = None

# number of jobs to launch in parallel; -1 is all but 1 processor used
n_jobs = -1

# number of samples to generate
n_samples = 100

# generate a batch of files using generated LHS
stm.modify_xbm_iwr(
    output_dir=output_directory,
    flow_realizations_directory=flow_realizations_directory,
    scenario=scenario,
    basin_name=basin_name,
    seed_value=seed_value,
    n_jobs=n_jobs,
    n_samples=n_samples,
    save_sample=False,
    randomly_select_flow_sample=True,
)

statemodify.modify_single_xbm_iwr#

statemodify.modify_single_xbm_iwr(iwr_multiplier, flow_realizations_directory, output_dir, scenario='', basin_name='Upper_Colorado', sample_id=0, n_sites=208, n_years=105, xbm_skip_rows=1, iwr_skip_rows=1, xbm_template_file=None, iwr_template_file=None, xbm_data_specification_file=None, iwr_data_specification_file=None, historical_column=0, months_in_year=12, seed_value=None, randomly_select_flow_sample=True, desired_sample_number=None)[source]#

Build modified XBM and IWR files for a single sample using synthetic streamflow realizations generated by a Hidden Markov Model (HMM).

Parameters:
  • flow_realizations_directory (str) – Full path to the directory containing the flow realization files for each sample. E.g., AnnualQ_s0.txt files produced by the ‘hmm_multisite_sample’ function.

  • iwr_multiplier (float) – Irrigation water requirement multiplier

  • sample_id (int) – Numeric ID of sample that is being processed. Defaults to 0.

  • output_dir (str) – Path to output directory.

  • scenario (str) – Scenario name.

  • basin_name (str) – Name of basin. One of: Upper_Colorado, Yampa, San_Juan, Gunnison, White.

  • n_sites (int) – number of sites

  • n_years (int) – number of years

  • xbm_skip_rows (int) – number of rows to skip in XBM template file

  • iwr_skip_rows (int) – number of rows to skip in IWR template file

  • xbm_template_file (Union[None, str]) – Template file to build XBM adjustment off of

  • iwr_template_file (Union[None, str]) – Template file to build IWR adjustment off of

  • xbm_data_specification_file (Union[None, str]) – Specification YAML file for XBM format

  • iwr_data_specification_file (Union[None, str]) – Specification YAML file for IWR format

  • historical_column (int) – Index of year to use for historical data

  • months_in_year (int) – Number of months in year

  • seed_value (Union[None, int], optional) – Integer to use for random seed or None if not desired

  • randomly_select_flow_sample (bool) – Choice to randomly select a realization sample from the flow files directory.

  • desired_sample_number (Union[None, int]) – If ‘randomly_select_flow_sample’ is set to False, select a desired sample number as an integer

statemodify.get_reservoir_structure_ids#

statemodify.get_reservoir_structure_ids(basin_name, template_file=None, data_specification_file=None)[source]#

Generate a list of structure ids that are in the input file.

Parameters:
  • basin_name (str) – Name of basin. One of: Upper_Colorado, Yampa, San_Juan, Gunnison, White.

  • template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.

  • data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.

Returns:

List of structure ids

Return type:

List

statemodify.modify_single_res#

statemodify.modify_single_res(output_dir, scenario, basin_name, sample, sample_id=0, template_file=None, data_specification_file=None, target_structure_id_list=None, skip_rows=0)[source]#

Modify a single reservoir (.res) file based on a user provided sample.

Parameters:
  • output_dir (str) – Path to output directory.

  • scenario (str) – Scenario name.

  • basin_name (str) – Name of basin. One of: Upper_Colorado, Yampa, San_Juan, Gunnison, White.

  • sample (np.array) – An array of samples for each parameter.

  • sample_id (int) – Numeric ID of sample that is being processed. Defaults to 0.

  • template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.

  • data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.

  • target_structure_id_list (Union[None, List[str]]) – Structure id list to process. If None, all structure ids will be processed.

  • skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 0

statemodify.modify_res#

statemodify.modify_res(output_dir, scenario, basin_name='Gunnison', template_file=None, data_specification_file=None, target_structure_id_list=None, skip_rows=0, seed_value=None, n_jobs=-1, n_samples=1, save_sample=False)[source]#

Modify reservoir (.res) files in parallel based on samples generated or provided by the user.

Parameters:
  • output_dir (str) – Path to output directory.

  • scenario (str) – Scenario name.

  • basin_name (str) – Name of basin. One of: Upper_Colorado, Yampa, San_Juan, Gunnison, White.

  • template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.

  • data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.

  • target_structure_id_list (Union[None, List[str]]) – Structure id list to process. If None, all structure ids will be processed.

  • skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 0

  • seed_value (Union[None, int], optional) – Integer to use for random seed or None if not desired

  • n_jobs (int) – Number of jobs to process in parallel. Defaults to -1 meaning all but 1 processor.

  • n_samples (int) – Number of samples to generate. Defaults to 1.

  • save_sample (bool) – Choice to save LHS sample or not; default False. If True, sample array will be written to the output directory.

Example:

import statemodify as stm


output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# basin name to process
basin_name = "Gunnison"

# seed value for reproducibility if so desired
seed_value = 0

# number of jobs to launch in parallel; -1 is all but 1 processor used
n_jobs = 2

# number of samples to generate
n_samples = 2

stm.modify_res(
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    target_structure_id_list=None,
    seed_value=seed_value,
    n_jobs=n_jobs,
    n_samples=n_samples,
    save_sample=False,
)

HMM Functions#

statemodify.hmm_multisite_fit#

statemodify.hmm_multisite_fit(n_basins=5, save_parameters=False, output_directory=None)[source]#

Fits a Hidden Markov Model (HMM) to multisite data.

Parameters:
  • n_basins (int, optional) – The number of basins to fit the model to. Defaults to 5.

  • save_parameters (bool, optional) – If True, saves the model parameters to the specified output directory. Defaults to False.

  • output_directory (Union[None, str], optional) – The directory where model parameters will be saved if save_parameters is True. If None, parameters are not saved. Defaults to None.

statemodify.hmm_multisite_sample#

statemodify.hmm_multisite_sample(logAnnualQ_h, transition_matrix, unconditional_dry, dry_state_means, wet_state_means, covariance_matrix_dry, covariance_matrix_wet, n_basins=5, n_alternatives=100, save_samples=True, output_directory=None)[source]#

Generate multisite samples for hydrological modeling using a Hidden Markov Model (HMM).

Parameters:
  • logAnnualQ_h (ndarray) – Historical log space annual flows.

  • transition_matrix (ndarray) – The transition matrix of the HMM.

  • unconditional_dry (float) – Unconditional probability of the dry state.

  • dry_state_means (ndarray) – Means of the log space flows in the dry state.

  • wet_state_means (ndarray) – Means of the log space flows in the wet state.

  • covariance_matrix_dry (ndarray) – Covariance matrix for the dry state.

  • covariance_matrix_wet (ndarray) – Covariance matrix for the wet state.

  • n_basins (int) – Number of basins to simulate. Defaults to 5.

  • n_alternatives (int) – Number of alternative sequences to generate. Defaults to 100.

  • save_samples (bool) – Whether to save the generated samples. Defaults to True.

  • output_directory (Union[None, str]) – Directory where samples should be saved. Required if save_samples is True.

Raises:

ValueError – If save_samples is True but output_directory is None.
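
The core of this sampling procedure can be sketched with plain NumPy for a two-state (dry/wet) chain. All names, dimensions, and parameter values below are illustrative and not the package's internals:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

n_years, n_basins = 50, 5

# illustrative HMM parameters: state 0 = dry, state 1 = wet
transition_matrix = np.array([[0.7, 0.3],   # P(dry->dry), P(dry->wet)
                              [0.3, 0.7]])  # P(wet->dry), P(wet->wet)
unconditional_dry = 0.5
state_means = [np.full(n_basins, 3.0), np.full(n_basins, 4.0)]  # log-space means
state_covs = [np.eye(n_basins) * 0.1, np.eye(n_basins) * 0.2]

# simulate the hidden state sequence
states = np.empty(n_years, dtype=int)
states[0] = rng.random() >= unconditional_dry  # 0 = dry, 1 = wet
for t in range(1, n_years):
    states[t] = rng.random() >= transition_matrix[states[t - 1], 0]

# draw correlated log-space flows conditional on each year's state
log_flows = np.array([
    rng.multivariate_normal(state_means[s], state_covs[s]) for s in states
])
flows = np.exp(log_flows)  # back-transform to real space

print(flows.shape)  # (50, 5)
```

Each simulated year first draws a hidden state from the transition matrix, then draws correlated log-space flows for all basins from that state's multivariate normal.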

statemodify.get_samples#

statemodify.get_samples(param_dict, basin_name, n_samples=1, sampling_method='LHS', seed_value=None)[source]#

Generate or load Latin Hypercube Samples (LHS).

Currently, this reads in an input file of precalculated samples. In a future release this should happen on demand, with a seed value when reproducibility is needed.

Parameters:
  • sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling

  • n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.

  • seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.

statemodify.generate_dry_state_means#

statemodify.generate_dry_state_means()[source]#

Generate or load dry state means.

Currently, this reads in an input file of precalculated samples. In a future release this should happen on demand, with a seed value when reproducibility is needed.

statemodify.generate_wet_state_means#

statemodify.generate_wet_state_means(load=True)[source]#

Generate or load wet state means.

Currently, this reads in an input file of precalculated samples. In a future release this should happen on demand, with a seed value when reproducibility is needed.

statemodify.generate_dry_covariance_matrix#

statemodify.generate_dry_covariance_matrix(load=True)[source]#

Generate or load dry covariance matrix.

Currently, this reads in an input file of precalculated samples. In a future release this should happen on demand, with a seed value when reproducibility is needed.

statemodify.generate_wet_covariance_matrix#

statemodify.generate_wet_covariance_matrix(load=True)[source]#

Generate or load wet covariance matrix.

Currently, this reads in an input file of precalculated samples. In a future release this should happen on demand, with a seed value when reproducibility is needed.

statemodify.generate_transition_matrix#

statemodify.generate_transition_matrix(load=True)[source]#

Generate or load transition matrix.

Currently, this reads in an input file of precalculated samples. In a future release this should happen on demand, with a seed value when reproducibility is needed.

statemodify.calculate_array_monthly#

statemodify.calculate_array_monthly(df, value_fields, year_field='year')[source]#

Create an array of month rows and site columns for each year in the data frame.

Parameters:
  • df (pd.DataFrame) – Input template data frame

  • value_fields (List[str]) – Month fields in data

  • year_field (str) – Name of the year field

Returns:

Array of values indexed by (month, site)

Return type:

np.array
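
The reshaping this performs can be illustrated with a small pandas frame. The column names here are hypothetical, not the StateMod template's actual fields:

```python
import numpy as np
import pandas as pd

# hypothetical template frame: one row per site per year, one column per month
df = pd.DataFrame({
    "year": [2000, 2000, 2001, 2001],
    "site": ["A", "B", "A", "B"],
    "jan":  [1.0, 2.0, 3.0, 4.0],
    "feb":  [5.0, 6.0, 7.0, 8.0],
})

value_fields = ["jan", "feb"]

# stack each year's (site x month) block into (month, site) rows
arrays = []
for _, grp in df.groupby("year"):
    arrays.append(grp[value_fields].to_numpy().T)  # shape: (n_months, n_sites)
monthly_arr = np.vstack(arrays)

print(monthly_arr.shape)  # (4, 2) -> (n_years * n_months, n_sites)
```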

statemodify.calculate_array_annual#

statemodify.calculate_array_annual(monthly_arr)[source]#

Calculate annual values.

Parameters:

monthly_arr (np.array) – Array of monthly xbm values

Returns:

Array of values indexed by (year, site)

Return type:

np.array

statemodify.calculate_annual_sum#

statemodify.calculate_annual_sum(arr, axis=1)[source]#

Calculate annual sum of input array.

Parameters:
  • arr (np.array) – Input 2D array indexed by (year, site)

  • axis (int) – Axis to sum over. Default: 1

Returns:

Array of sums

Return type:

np.array

statemodify.calculate_annual_mean_fractions#

statemodify.calculate_annual_mean_fractions(arr_annual, arr_sum)[source]#

Calculate annual mean fractions of total values.

Parameters:
  • arr_annual (np.array) – Array of annual values per site

  • arr_sum (np.array) – Array of annual sums

Returns:

Array of fractions

Return type:

np.array
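
Together, `calculate_annual_sum` and `calculate_annual_mean_fractions` amount to the following NumPy operations. This is a sketch of the idea with made-up data, not the package code:

```python
import numpy as np

# hypothetical (year, site) array of annual values
arr_annual = np.array([[10.0, 30.0],
                       [20.0, 20.0]])

# sum across sites for each year (axis=1)
arr_sum = arr_annual.sum(axis=1)  # -> [40., 40.]

# each site's fraction of the yearly total, averaged over years
fractions = (arr_annual / arr_sum[:, np.newaxis]).mean(axis=0)

print(fractions)  # [0.375 0.625]
```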

statemodify.fit_iwr_model#

statemodify.fit_iwr_model(xbm_data_array_annual, iwr_data_array_annual)[source]#

Model annual irrigation demand anomaly as a function of annual flow anomaly at last node.

Parameters:
  • xbm_data_array_annual (np.array) – Annual flow from XBM

  • iwr_data_array_annual (np.array) – Annual data from IWR
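
The underlying relationship can be sketched as an ordinary least-squares fit of demand anomalies on flow anomalies. The data below is synthetic and the fitting approach is a minimal stand-in, not necessarily the model this function fits:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# hypothetical annual series: flow at the last node and irrigation demand
xbm_annual = rng.normal(100.0, 10.0, size=60)
iwr_annual = 50.0 - 0.2 * xbm_annual + rng.normal(0.0, 1.0, size=60)

# anomalies relative to the long-term mean
flow_anom = xbm_annual - xbm_annual.mean()
demand_anom = iwr_annual - iwr_annual.mean()

# slope and intercept of demand anomaly as a function of flow anomaly
slope, intercept = np.polyfit(flow_anom, demand_anom, deg=1)

print(round(slope, 2))  # approximately -0.2: demand rises in dry (low-flow) years
```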

statemodify.generate_hmm_inputs#

statemodify.generate_hmm_inputs(template_file, n_basins=5)[source]#

Generate HMM input files for all basins.

statemodify.generate_flows#

statemodify.generate_flows(dry_state_means, wet_state_means, covariance_matrix_dry, covariance_matrix_wet, transition_matrix, mu_0, sigma_0, mu_1, sigma_1, p00, p11, n_basins=5, n_years=105, seed_value=None)[source]#

Generate synthetic streamflow data using a Hidden Markov Model (HMM).

Parameters:
  • dry_state_means (np.array) – mean streamflow values for dry state

  • wet_state_means (np.array) – mean streamflow values for wet state

  • covariance_matrix_dry (np.array) – covariance matrix for dry state

  • covariance_matrix_wet (np.array) – covariance matrix for wet state

  • transition_matrix (np.array) – transition matrix for HMM

  • mu_0 (float) – mean multiplier for dry state

  • sigma_0 (float) – covariance multiplier for dry state

  • mu_1 (float) – mean multiplier for wet state

  • sigma_1 (float) – covariance multiplier for wet state

  • p00 (float) – transition matrix multiplier for dry state

  • p11 (float) – transition matrix multiplier for wet state

  • n_basins (int) – number of sites to generate data for

  • n_years (int) – number of years to generate data for

  • seed_value (Union[None, int]) – random seed value

Returns:

synthetic streamflow data

Return type:

np.array

statemodify.generate_modified_file#

statemodify.generate_modified_file(source_object, monthly_data_array, output_dir, scenario, sample_id=0)[source]#

Generate modified template data frame.

Parameters:
  • source_object (object) – Instantiated object containing the source data and file specifics.

  • monthly_data_array (np.array) – Array of monthly data per year and site matching the shape of the input value columns.

  • sample_id (int) – Numeric ID of sample that is being processed. Defaults to 0.

  • output_dir (str) – Path to output directory.

  • scenario (str) – Scenario name.

Output Modification#

statemodify.convert_xdd#

statemodify.convert_xdd(*, output_path='./output', allow_overwrite=False, xdd_files='**/*.xdd', id_subset=None, parallel_jobs=4, preserve_string_dtype=True)[source]#

Convert StateMod output .xdd files to compressed, columnar .parquet files.

Easily interoperate with pandas dataframes.

Parameters:
  • output_path (str) – Path to a folder where outputs should be written; default “./output”

  • allow_overwrite (bool) – If False, abort if files already exist in the output_path; default False

  • xdd_files (List[str]) – File(s) or glob(s) to the .xdd files to convert; default “**/*.xdd”

  • id_subset (List[str]) – List of structure IDs to convert, or None for all; default None

  • parallel_jobs (int) – How many files to process in parallel; default 4

  • preserve_string_dtype (bool) – Keep parsed data as strings instead of casting them to their actual types; default True

Returns:

None

Return type:

None

Example:

import statemodify as stm

stm.xdd.convert_xdd(
    # path to a directory where output .parquet files should be written
    output_path="./output",
    # whether to abort if .parquet files already exist at the output_path
    allow_overwrite=False,
    # path, glob, or a list of paths/globs to the .xdd files you want to convert
    xdd_files="**/*.xdd",
    # if the output .parquet files should only contain a subset of structure ids, list them here; None for all
    id_subset=None,
    # how many .xdd files to convert in parallel; optimally you will want 2-4 CPUs per parallel process
    parallel_jobs=4,
)

# look for your output .parquet files at the output_path!

statemodify.read_xre#

statemodify.read_xre(xre_path, reservoir_name)[source]#

Read xre files generated by statemod using HMM data and historical data.

Parameters:
  • xre_path (str) – str, path to xre files

  • reservoir_name (str) – str, name of reservoir that begins XRE file

Returns:

A tuple of four numpy arrays: reservoir storage across all HMM realizations; reservoir storage from the historical record; monthly means of storage from the historical record; and monthly 1st percentiles of storage from the historical record.

statemodify.extract_xre_data#

statemodify.extract_xre_data(structure_name, structure_id, input_file=None, basin_name=None, write_csv=False, write_parquet=False, output_directory=None)[source]#

Extract a single reservoir from a raw .xre file in to a Pandas data frame.

Optionally save output to a CSV or Parquet file.

Parameters:
  • structure_name (str) – str, name of the reservoir

  • structure_id (str) – str, structure ID for reservoir of interest

  • input_file (Optional[str]) – Union[None, str], path to the xre file

  • basin_name (Optional[str]) – Union[None, str], Name of basin for either: Upper_Colorado Yampa San_Juan Gunnison White

  • write_csv (bool) – bool, whether to write output to a CSV file

  • write_parquet (bool) – bool, whether to write output to a Parquet file

  • output_directory (Optional[str]) – Union[str, None], path to the output directory

Return type:

DataFrame

Returns:

pd.DataFrame, a Pandas data frame containing the extracted reservoir data

Example:

import statemodify as stm

xre_file = "<path_to_file>/gm2015H.xre"  # path to the xre file
structure_id = "2803590"  # structure ID for reservoir of interest
structure_name = "Blue_Mesa"  # name of the reservoir

df = stm.extract_xre_data(
    structure_name=structure_name,
    structure_id=structure_id,
    input_file=xre_file,
    output_directory=None,
    write_csv=False,
    write_parquet=False,
)

Sampling#

statemodify.build_problem_dict#

statemodify.build_problem_dict(modify_dict, fill=False)[source]#

Build the problem set from the input modify dictionary provided by the user.

Parameters:
  • modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler.

  • fill (bool) – If True, fill in missing names using the index of the ids list. Default False.

Returns:

Dictionary ready for use by SALib sample generators

Return type:

dict

Example:

import statemodify as stm

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {"ids": ["10001", "10004"], "bounds": [-1.0, 1.0]}

# generate problem dictionary to use with SALib sampling components
problem_dict = stm.build_problem_dict(setup_dict, fill=False)

statemodify.generate_samples#

statemodify.generate_samples(problem_dict, n_samples=1, sampling_method='LHS', seed_value=None)[source]#

Generate an array of statistically generated samples.

Parameters:
  • problem_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler.

  • sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling

  • n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.

  • seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.

Returns:

Array of samples

Return type:

np.array
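
The essence of Latin Hypercube Sampling, shown here as a minimal NumPy version rather than SALib's implementation, is one uniform draw per equal-probability stratum in each dimension, with the strata independently shuffled:

```python
import numpy as np

def lhs(n_samples, bounds, seed=None):
    """Minimal Latin Hypercube sampler; bounds is a list of (low, high) pairs."""
    rng = np.random.default_rng(seed)
    n_dims = len(bounds)
    # one uniform draw inside each of n_samples equal strata, per dimension
    strata = (np.arange(n_samples)[:, None]
              + rng.random((n_samples, n_dims))) / n_samples
    # shuffle the strata independently in each dimension
    for d in range(n_dims):
        rng.shuffle(strata[:, d])
    # scale the unit-interval samples to the requested bounds
    lows = np.array([b[0] for b in bounds])
    highs = np.array([b[1] for b in bounds])
    return lows + strata * (highs - lows)

sample = lhs(n_samples=4, bounds=[(-0.5, 1.0), (-0.5, 1.0)], seed=123)
print(sample.shape)  # (4, 2)
```

Stratifying each dimension guarantees coverage of the full bound range even at small sample sizes.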

statemodify.validate_modify_dict#

statemodify.validate_modify_dict(modify_dict, required_keys=('ids',), fill=False)[source]#

Validate user input modify dictionary to ensure all necessary elements are present.

Parameters:
  • modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler.

  • required_keys (Tuple[str]) – Keys required to be present in the input dictionary.

  • fill (bool) – If True, fill in missing names using the index of the ids list. Default False.

Returns:

Dictionary of validated parameters

Return type:

dict

statemodify.generate_sample_iwr#

statemodify.generate_sample_iwr(n_samples=1, sampling_method='LHS', seed_value=None)[source]#

Generate samples for the IWR multiplier.

Parameters:
  • sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling

  • n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.

  • seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.

statemodify.generate_sample_all_params#

statemodify.generate_sample_all_params(n_samples=1, sampling_method='LHS', seed_value=None)[source]#

Generate samples for all parameters.

Parameters:
  • sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling

  • n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.

  • seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.

Modification#

statemodify.set_alignment#

statemodify.set_alignment(value, n_spaces=0, align='left')[source]#

Set left or right alignment.

Parameters:
  • value (str) – Value to evaluate.

  • n_spaces (int) – Number of spaces to buffer the value by. If n_spaces is less than the length of the value, no padding is added.

  • align (str) – Either ‘left’ or ‘right’ alignment for the value.

Return type:

str

Returns:

Value with string padding.

statemodify.pad_with_spaces#

statemodify.pad_with_spaces(value, expected_width, align='left')[source]#

Pad a string with the number of spaces specified by the user.

Parameters:
  • value (str) – Value to evaluate.

  • expected_width (int) – Expected width of the field.

  • align (str) – Either ‘left’ or ‘right’ alignment for the value.

Return type:

str

Returns:

Value with string padding.
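
These fixed-width helpers behave like Python's built-in `str.ljust`/`str.rjust`. A sketch of the same idea (the function name here is hypothetical, not the package internals):

```python
def pad(value, expected_width, align="left"):
    """Pad a string to a fixed field width with left or right alignment."""
    if align == "left":
        return value.ljust(expected_width)
    elif align == "right":
        return value.rjust(expected_width)
    raise ValueError(f"align must be 'left' or 'right', got: '{align}'")

print(repr(pad("10001", 8, align="right")))  # '   10001'
print(repr(pad("10001", 8, align="left")))   # '10001   '
```

Note that, like `ljust`/`rjust`, values already wider than the field are returned unchanged rather than truncated.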

statemodify.add_zero_padding#

statemodify.add_zero_padding(x, precision=2)[source]#

Some fields expect trailing zero padding that pandas strips off when parsing. This method adds it back in.

Parameters:
  • x (str) – Float value from file that is represented as a string.

  • precision (int) – Precision to account for.

Return type:

str

Returns:

Zero padded string.
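
The effect can be reproduced with standard float formatting; this is a hypothetical stand-in rather than the package's exact logic:

```python
def restore_precision(x, precision=2):
    """Re-apply trailing zeros that pandas may have dropped, e.g. '1.5' -> '1.50'."""
    return f"{float(x):.{precision}f}"

print(restore_precision("1.5"))   # 1.50
print(restore_precision("3", 2))  # 3.00
```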

statemodify.populate_dict#

statemodify.populate_dict(line, field_dict, column_widths, column_list, data_types, replace_dict)[source]#

Populate the input dictionary with values from each line based on column widths.

Parameters:
  • line (str) – Line of data as a string from the input file.

  • field_dict (dict) – Dictionary holding values for each field.

  • column_widths (dict) – Dictionary of column names to expected widths.

  • column_list (list) – List of columns to process.

  • data_types (dict) – Dictionary of column names to data types.

  • replace_dict (dict) – Dictionary mapping values to their replacements where necessary. For example, {"****": ""}

Return type:

dict

Returns:

Populated data dictionary.
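
Fixed-width parsing of this kind can be sketched as slicing each line at cumulative column widths. The record layout, names, and widths below are illustrative only:

```python
# hypothetical fixed-width record: 8-char id, 12-char name, 8-char value
line = "10001   Fish_Res    12.50"
column_widths = {"id": 8, "name": 12, "value": 8}
data_types = {"id": str, "name": str, "value": float}
replace_dict = {"****": ""}

record = {}
start = 0
for column, width in column_widths.items():
    raw = line[start:start + width].strip()
    raw = replace_dict.get(raw, raw)   # swap sentinel values, e.g. '****' -> ''
    record[column] = data_types[column](raw)
    start += width

print(record)  # {'id': '10001', 'name': 'Fish_Res', 'value': 12.5}
```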

statemodify.prep_data#

statemodify.prep_data(field_dict, template_file, column_list, column_widths, data_types, comment='#', skip_rows=0, replace_dict={})[source]#

Ingest statemod file and format into a data frame.

Parameters:
  • field_dict (dict) – Dictionary holding values for each field.

  • template_file (str) – Statemod input file to parse.

  • column_widths (dict) – Dictionary of column names to expected widths.

  • column_list (list) – List of columns to process.

  • data_types (dict) – Dictionary of column names to data types.

  • comment (str) – Characters leading string indicating ignoring a line.

  • skip_rows (int) – The number of uncommented rows of data to skip.

  • replace_dict (dict) – Dictionary mapping values to their replacements where necessary. For example, {"****": ""}

Returns:

A tuple where [0] is a data frame of data from the file and [1] is the header data from the file.

statemodify.construct_outfile_name#

statemodify.construct_outfile_name(template_file, output_directory, scenario, sample_id)[source]#

Construct output file name from input template.

Parameters:
  • template_file (str) – Statemod input file to parse.

  • output_directory (str) – Output directory to save outputs to.

  • scenario (str) – Scenario name.

  • sample_id (int) – ID of sample.

Return type:

str

Returns:

Full path with file name and extension for the modified output file.

statemodify.construct_data_string#

statemodify.construct_data_string(df, column_list, column_widths, column_alignment)[source]#

Format line and construct data string.

Parameters:
  • df (pd.DataFrame) – Data frame of data to format into the output string.

  • column_widths (dict) – Dictionary of column names to expected widths.

  • column_list (list) – List of columns to process.

  • column_alignment (dict) – Dictionary of column names to their expected alignment (e.g., right, left).

Return type:

str

Returns:

Formatted data string.

statemodify.apply_adjustment_factor#

statemodify.apply_adjustment_factor(data_df, value_columns, query_field, target_ids, factor, factor_method='add')[source]#

Apply adjustment to template file values for target ids using a sample factor.

Parameters:
  • data_df (pd.DataFrame) – Data frame of data content from file.

  • value_columns (list) – Value columns that may be modified.

  • query_field (str) – Field name to conduct queries for.

  • target_ids (list) – Ids associated in query field to modify.

  • factor (float) – Adjustment value to apply to the selected value columns using the chosen factor_method.

  • factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’, ‘assign’. Defaults to ‘add’.

Returns:

A data frame of modified values to replace the original data with.

Return type:

pd.DataFrame
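
The 'add' method amounts to a masked pandas update; a minimal sketch with hypothetical data and column names:

```python
import pandas as pd

# hypothetical template data: two value columns, keyed by structure id
data_df = pd.DataFrame({
    "id": ["10001", "10004", "10005"],
    "oct": [1.0, 2.0, 3.0],
    "nov": [4.0, 5.0, 6.0],
})

value_columns = ["oct", "nov"]
target_ids = ["10001", "10004"]
factor = 0.5

# 'add' method: offset the value columns for the targeted ids only
mask = data_df["id"].isin(target_ids)
data_df.loc[mask, value_columns] += factor

print(data_df.loc[mask, "oct"].tolist())  # [1.5, 2.5]
```

The 'multiply' and 'assign' methods would use `*=` and `=` in place of `+=` on the same masked selection.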

statemodify.validate_bounds#

statemodify.validate_bounds(bounds_list, min_value=-0.5, max_value=1.0)[source]#

Ensure sample bounds provided by the user conform to a feasible range of values in feet per month.

Parameters:
  • bounds_list (List[List[float]]) – List of bounds to use for each parameter.

  • min_value (float) – Minimum value of bounds that is feasible. Default -0.5.

  • max_value (float) – Maximum value of bounds that is feasible. Default 1.0.

Returns:

None

Return type:

None

Example:

import statemodify as stm

# list of bounds to use for each parameter
bounds_list = [[-0.5, 1.0], [-0.5, 1.0]]

# validate bounds
stm.validate_bounds(bounds_list=bounds_list, min_value=-0.5, max_value=1.0)

statemodify.Modify#

statemodify.Modify(comment_indicator, data_dict, column_widths, column_alignment, data_types, column_list, value_columns)[source]#

Modification template for data transformation.

Parameters:
  • comment_indicator (str) – Character(s) indicating that a line in the file is a comment. Default “#”

  • data_dict (Dict[str, List[None]]) – Field specification in the file as a dictionary to hold the values for each field

  • column_widths (Dict[str, int]) – Column widths for the output file as a dictionary.

  • column_alignment (Dict[str, str]) – Expected column alignment.

  • data_types (Dict[str, type]) – Expected data types for each field

  • column_list (List[str]) – List of columns to process.

  • value_columns (List[str]) – List of columns that may be modified.

Batch Modification#

statemodify.get_required_arguments#

statemodify.get_required_arguments(fn)[source]#

Get all required arguments for a function as a list.

Return type:

list

statemodify.get_arguments_values#

statemodify.get_arguments_values(fn)[source]#

Get all arguments and their values as a dictionary.

Return type:

dict

statemodify.generate_parameters#

statemodify.generate_parameters(problem_dict)[source]#

Validate and generate parameters needed for the desired functions.

Return type:

dict

statemodify.modify_batch#

statemodify.modify_batch(problem_dict)[source]#

Run multiple modification functions from the same problem set and unified sample.

Parameters:

problem_dict (dict) – Dictionary of options for running multiple functions in batch mode. Used to generate the sample and pass other options to participating functions. See the following example.

Returns:

Dictionary of settings and samples for each participating function.

Return type:

dict

Example:

import statemodify as stm

# variables that apply to multiple functions
output_dir = "<your_desired_directory>"
basin_name = "Upper_Colorado"
scenario = "1"
seed_value = 77

# problem dictionary
problem_dict = {
    "n_samples": 2,
    "num_vars": 3,
    "names": ["modify_eva", "modify_ddr", "modify_ddm"],
    "bounds": [[-0.5, 1.0], [0.5, 1.0], [0.5, 1.0]],
    # additional settings for each function
    "modify_eva": {
        "seed_value": seed_value,
        "output_dir": output_dir,
        "scenario": scenario,
        "basin_name": basin_name,
        "query_field": "id",
        "ids": ["10001", "10004"],
    },
    "modify_ddr": {
        "seed_value": seed_value,
        "output_dir": output_dir,
        "scenario": scenario,
        "basin_name": basin_name,
        "query_field": "id",
        "ids": ["3600507.01", "3600507.02"],
        "admin": [None, 0],
        "on_off": [-1977, 1],
    },
    "modify_ddm": {
        "seed_value": seed_value,
        "output_dir": output_dir,
        "scenario": scenario,
        "basin_name": basin_name,
        "query_field": "id",
        "ids": ["3600507", "3600603"],
    },
}

# run in batch
fn_parameter_dict = stm.modify_batch(problem_dict=problem_dict)

Visualization#

statemodify.plot_flow_duration_curves#

statemodify.plot_flow_duration_curves(flow_realizations_directory, save_figure=False, output_directory=None, figure_name=None, dpi=300)[source]#

Plot flow duration curves for historical and synthetic data.

Parameters:

flow_realizations_directory (str) – Full path to the directory containing the flow realization files for each sample. E.g., AnnualQ_s0.txt files produced by the ‘hmm_multisite_sample’ function.
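
A flow duration curve plots each flow value against its exceedance probability. The computation itself is just a sort plus a rank, sketched here with synthetic NumPy data and without the plotting; the Weibull plotting position used below is one common convention, not necessarily this function's:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# hypothetical annual flows for one site (e.g. one realization's annual series)
flows = rng.lognormal(mean=3.0, sigma=0.5, size=100)

# sort descending and compute the Weibull plotting-position exceedance probability
sorted_flows = np.sort(flows)[::-1]
n = len(sorted_flows)
exceedance = np.arange(1, n + 1) / (n + 1)  # P(flow >= sorted_flows[i])

# the (exceedance, sorted_flows) pairs are what the duration curve plots
print(exceedance[0], exceedance[-1])  # smallest and largest probabilities
```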

statemodify.plot_res_quantiles#

statemodify.plot_res_quantiles(hmm_data, historical_mean, reservoir_name, save_figure=False, output_directory=None, dpi=300)[source]#

Plot quantiles of the HMM data and the mean and 1st percentile of the historical record.

Parameters:
  • hmm_data (array) – a numpy array with monthly storage data from HMM realizations

  • historical_mean (array) – a numpy array with the historical mean monthly storage

  • reservoir_name (str) – a string with the name of the reservoir

statemodify.plot_reservoir_boxes#

statemodify.plot_reservoir_boxes(hmm_data, historical_data, reservoir_name, save_figure=False, output_directory=None, dpi=300)[source]#

Generate a boxplot comparing the historical record to the HMM realizations.

Parameters:
  • hmm_data (array) – a numpy array with monthly storage data from HMM realizations

  • historical_data (array) – a numpy array with monthly storage data from hist data

  • reservoir_name (str) – a string with the name of the reservoir

Utilities#

statemodify.yaml_to_dict#

statemodify.yaml_to_dict(yaml_file)[source]#

Read in a YAML file and convert to a typed dictionary.

NOTE: arbitrary code can be executed from the YAML file due to the use of UnsafeLoader; only load trusted files.

Parameters:

yaml_file (str) – Full path with file name and extension to the input YAML file.

Returns:

Dictionary of typed elements of the YAML file.

Return type:

dict

statemodify.select_template_file#

statemodify.select_template_file(basin_name, template_file, extension=None)[source]#

Select either the default template file or a user provided one.

Parameters:
  • basin_name (str) – Name of basin for either: Upper_Colorado Yampa San_Juan Gunnison White

  • template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.

  • extension (Union[None, str]) – Extension of the target template file with no dot.

Returns:

Template file path

Return type:

str

statemodify.select_data_specification_file#

statemodify.select_data_specification_file(yaml_file, extension=None)[source]#

Select either the default data specification file or a user provided one.

Parameters:
  • yaml_file (Union[None, str]) – If a full path to a YAML file is provided it will be used. Otherwise the default file in this package will be used.

  • extension (Union[None, str]) – Extension of the target template file with no dot.

Returns:

Template file path

Return type:

str