Python API#
statemodify offers a programmatic API in Python.
Note
For questions or requests for support, please reach out to the development team. Your feedback is much appreciated as we evolve this API!
Input Modification#
statemodify.modify_eva#
- statemodify.modify_eva(modify_dict, query_field, output_dir, scenario, basin_name, sampling_method='LHS', n_samples=1, skip_rows=1, n_jobs=-1, seed_value=None, template_file=None, factor_method='add', data_specification_file=None, min_bound_value=-0.5, max_bound_value=1.0, save_sample=False, sample_array=None)[source]#
Modify StateMod net reservoir evaporation annual data file (.eva) using a Latin Hypercube Sample from the user.
Samples are processed in parallel. Modification is targeted at ids chosen by the user to modify and specified in the modify_dict argument. The user must specify bounds for sampling.
- Parameters:
modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.
query_field (str) – Field name to use for target query.
output_dir (str) – Path to output directory.
scenario (str) – Scenario name.
basin_name (str) – Name of basin; one of: Upper_Colorado, Yampa, San_Juan, Gunnison, White
sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling
n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.
skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 1
n_jobs (int) – Number of jobs to process in parallel. Defaults to -1 meaning all but 1 processor.
seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.
template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.
factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’. Defaults to ‘add’.
data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.
min_bound_value (float) – Minimum feasible sampling bounds in feet per month. Minimum allowable value: -0.5
max_bound_value (float) – Maximum feasible sampling bounds in feet per month. Maximum allowable value: 1.0
save_sample (bool) – Choice to save LHS sample or not; default False. If True, sample array will be written to the output directory.
sample_array (np.array) – Optionally provide array containing sample instead of generating it.
- Returns:
None
- Return type:
None
- Example:
import statemodify as stm

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {"ids": ["10001", "10004"], "bounds": [-0.5, 1.0]}

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# the number of samples you wish to generate
n_samples = 4

# seed value for reproducibility if so desired
seed_value = None

# number of rows to skip in file after comment
skip_rows = 1

# name of field to query
query_field = "id"

# number of jobs to launch in parallel; -1 is all but 1 processor used
n_jobs = -1

# basin to process
basin_name = "Upper_Colorado"

# generate a batch of files using the generated LHS
stm.modify_eva(
    modify_dict=setup_dict,
    query_field=query_field,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    sampling_method="LHS",
    n_samples=n_samples,
    skip_rows=skip_rows,
    n_jobs=n_jobs,
    seed_value=seed_value,
    template_file=None,
    factor_method="add",
    data_specification_file=None,
    min_bound_value=-0.5,
    max_bound_value=1.0,
    save_sample=False,
)
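The factor_method options amount to a simple element-wise operation on the targeted values. A minimal NumPy sketch of that idea (illustrative only; this is not the package's internal implementation, and the values shown are made up):

```python
import numpy as np

# hypothetical original evaporation values for the targeted ids
values = np.array([0.25, 0.40, 0.10])

# a single sampled factor drawn from the user-specified bounds
factor = -0.2

# factor_method="add" shifts each targeted value by the sampled factor
added = values + factor

# factor_method="multiply" scales each targeted value by the sampled factor
multiplied = values * factor
```

Because the .eva bounds are expressed in feet per month, the additive method is the default for evaporation, while demand-style files default to multiplicative scaling.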
statemodify.modify_single_eva#
- statemodify.modify_single_eva(modify_dict, query_field, output_dir, scenario, basin_name, sample, sample_id=0, skip_rows=1, template_file=None, factor_method='add', data_specification_file=None, min_bound_value=-0.5, max_bound_value=1.0)[source]#
Modify StateMod net reservoir evaporation annual data file (.eva) using a Latin Hypercube Sample from the user.
Modification is targeted at ids chosen by the user and specified in the modify_dict argument. The user must specify bounds for each field name.
- Parameters:
modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.
query_field (str) – Field name to use for target query.
sample (np.array) – An array of samples for each parameter.
sample_id (int) – Numeric ID of sample that is being processed. Defaults to 0.
output_dir (str) – Path to output directory.
scenario (str) – Scenario name.
basin_name (str) – Name of basin; one of: Upper_Colorado, Yampa, San_Juan, Gunnison, White
skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 1
template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise, the default template in this package will be used.
factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’. Defaults to ‘add’.
data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.
min_bound_value (float) – Minimum feasible sampling bounds in feet per month. Minimum allowable value: -0.5
max_bound_value (float) – Maximum feasible sampling bounds in feet per month. Maximum allowable value: 1.0
- Returns:
None
- Return type:
None
- Example:
import statemodify as stm
import numpy as np

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {"ids": ["10001", "10004"], "bounds": [-0.5, 1.0]}

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# sample id for the current run
sample_id = 0

# sample array for each parameter
sample = np.array([0.39])

# number of rows to skip in file after comment
skip_rows = 1

# name of field to query
query_field = "id"

# basin to process
basin_name = "Upper_Colorado"

# modify a single file for the given sample
stm.modify_single_eva(
    modify_dict=setup_dict,
    query_field=query_field,
    sample=sample,
    sample_id=sample_id,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    skip_rows=skip_rows,
    template_file=None,
    factor_method="add",
    data_specification_file=None,
    min_bound_value=-0.5,
    max_bound_value=1.0,
)
statemodify.modify_ddm#
- statemodify.modify_ddm(modify_dict, query_field, output_dir, scenario, basin_name, sampling_method='LHS', n_samples=1, skip_rows=1, n_jobs=-1, seed_value=None, template_file=None, factor_method='multiply', data_specification_file=None, min_bound_value=0.5, max_bound_value=1.5, save_sample=False, sample_array=None)[source]#
Parallel modification of StateMod municipal, industrial, transbasin Demands (.ddm).
Modified using a Latin Hypercube Sample from the user. Samples are processed in parallel. Modification is targeted at ids specified in the modify_dict argument. The user must specify the bounds from which the samples will be generated.
- Parameters:
modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.
query_field (str) – Field name to use for target query.
output_dir (str) – Path to output directory.
scenario (str) – Scenario name.
basin_name (str) – Name of basin; one of: Upper_Colorado, Yampa, San_Juan, Gunnison, White
sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling
n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.
skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 1
n_jobs (int) – Number of jobs to process in parallel. Defaults to -1 meaning all but 1 processor.
seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.
template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.
factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’. Defaults to ‘multiply’.
data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.
min_bound_value (float) – Minimum feasible sampling bound. Minimum allowable value: 0.5
max_bound_value (float) – Maximum feasible sampling bound. Maximum allowable value: 1.5
save_sample (bool) – Choice to save LHS sample or not; default False. If True, sample array will be written to the output directory.
sample_array (np.array) – Optionally provide array containing sample instead of generating it.
- Returns:
None
- Return type:
None
- Example:
import statemodify as stm

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {"ids": ["3600507", "3600603"], "bounds": [0.5, 1.0]}

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# the number of samples you wish to generate
n_samples = 4

# seed value for reproducibility if so desired
seed_value = None

# number of rows to skip in file after comment
skip_rows = 1

# name of field to query
query_field = "id"

# number of jobs to launch in parallel; -1 is all but 1 processor used
n_jobs = -1

# basin to process
basin_name = "Upper_Colorado"

# generate a batch of files using the generated LHS
stm.modify_ddm(
    modify_dict=setup_dict,
    query_field=query_field,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    sampling_method="LHS",
    n_samples=n_samples,
    skip_rows=skip_rows,
    n_jobs=n_jobs,
    seed_value=seed_value,
    template_file=None,
    factor_method="multiply",
    data_specification_file=None,
    min_bound_value=0.5,
    max_bound_value=1.5,
    save_sample=False,
)
statemodify.modify_single_ddm#
- statemodify.modify_single_ddm(modify_dict, query_field, output_dir, scenario, basin_name, sample, sample_id=0, skip_rows=1, template_file=None, factor_method='multiply', data_specification_file=None, min_bound_value=0.5, max_bound_value=1.5)[source]#
Modify StateMod municipal, industrial, transbasin Demands (.ddm) using a sample from the user.
Modification is targeted at ids specified in the modify_dict argument. The user must specify the bounds from which the samples will be generated.
- Parameters:
modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.
query_field (str) – Field name to use for target query.
sample (np.array) – An array of samples for each parameter.
sample_id (int) – Numeric ID of sample that is being processed. Defaults to 0.
output_dir (str) – Path to output directory.
scenario (str) – Scenario name.
basin_name (str) – Name of basin; one of: Upper_Colorado, Yampa, San_Juan, Gunnison, White
skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 1
template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.
factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’. Defaults to ‘multiply’.
data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.
min_bound_value (float) – Minimum feasible sampling bound. Minimum allowable value: 0.5
max_bound_value (float) – Maximum feasible sampling bound. Maximum allowable value: 1.5
- Returns:
None
- Return type:
None
- Example:
import statemodify as stm
import numpy as np

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {"ids": ["10001", "10004"], "bounds": [0.5, 1.0]}

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# sample id for the current run
sample_id = 0

# sample array for each parameter
sample = np.array([0.59, 0.72])

# number of rows to skip in file after comment
skip_rows = 1

# name of field to query
query_field = "id"

# basin to process
basin_name = "Upper_Colorado"

# modify a single file for the given sample
stm.modify_single_ddm(
    modify_dict=setup_dict,
    query_field=query_field,
    sample=sample,
    sample_id=sample_id,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    skip_rows=skip_rows,
    template_file=None,
    factor_method="multiply",
    data_specification_file=None,
    min_bound_value=0.5,
    max_bound_value=1.5,
)
statemodify.modify_ddr#
- statemodify.modify_ddr(modify_dict, query_field, output_dir, scenario, basin_name, sampling_method='LHS', n_samples=1, skip_rows=0, n_jobs=-1, seed_value=None, template_file=None, factor_method='multiply', data_specification_file=None, min_bound_value=0.5, max_bound_value=1.5, save_sample=False, sample_array=None)[source]#
Parallelized modification of StateMod water rights (.ddr) using a Latin Hypercube Sample from the user.
Samples are processed in parallel. Modification is targeted at ids chosen by the user to modify and specified in the modify_dict argument. The user must specify bounds to generate the sample.
- Parameters:
modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.
query_field (str) – Field name to use for target query.
output_dir (str) – Path to output directory.
scenario (str) – Scenario name.
basin_name (str) – Name of basin; one of: Upper_Colorado, Yampa, San_Juan, Gunnison, White
sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling
n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.
skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 0
n_jobs (int) – Number of jobs to process in parallel. Defaults to -1 meaning all but 1 processor.
seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.
template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.
factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’. Defaults to ‘multiply’.
data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.
min_bound_value (float) – Minimum feasible sampling bound. Minimum allowable value: 0.5
max_bound_value (float) – Maximum feasible sampling bound. Maximum allowable value: 1.5
save_sample (bool) – Choice to save LHS sample or not; default False. If True, sample array will be written to the output directory.
sample_array (np.array) – Optionally provide array containing sample instead of generating it.
- Returns:
None
- Return type:
None
- Example:
import statemodify as stm

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {
    # ids can either be 'struct' or 'id' values
    "ids": ["3600507.01", "3600507.02"],
    "bounds": [0.5, 1.0],
    # turn id on or off completely or for a given period
    # if 0 = off, 1 = on, YYYY = on for years >= YYYY, -YYYY = off for years > YYYY; see file header
    "on_off": [-1977, 1],
    # apply rank of administrative order where 0 is lowest (senior) and n is highest (junior); None is no change
    "admin": [[None, 2], [0, 1]],
    # optionally, pass a value that you want to assign for all ids; this overrides bounds
    "values": [0.7],
}

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# the number of samples you wish to generate
n_samples = 4

# seed value for reproducibility if so desired
seed_value = None

# number of rows to skip in file after comment
skip_rows = 0

# name of field to query
query_field = "id"

# number of jobs to launch in parallel; -1 is all but 1 processor used
n_jobs = -1

# basin to process
basin_name = "Upper_Colorado"

# generate a batch of files using the generated LHS
stm.modify_ddr(
    modify_dict=setup_dict,
    query_field=query_field,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    sampling_method="LHS",
    n_samples=n_samples,
    skip_rows=skip_rows,
    n_jobs=n_jobs,
    seed_value=seed_value,
    template_file=None,
    factor_method="multiply",
    data_specification_file=None,
    min_bound_value=0.5,
    max_bound_value=1.5,
    save_sample=False,
)
statemodify.modify_single_ddr#
- statemodify.modify_single_ddr(modify_dict, query_field, output_dir, scenario, basin_name, sample=array([], dtype=float64), sample_id=0, skip_rows=0, template_file=None, factor_method='multiply', use_values=False, use_sampling=True, data_specification_file=None, min_bound_value=0.5, max_bound_value=1.5)[source]#
Modify StateMod water rights (.ddr) file from sample provided by the user.
Modification is targeted at ids chosen by the user and specified in the modify_dict argument. The user must specify bounds if generating samples.
- Parameters:
modify_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler. See following example.
query_field (str) – Field name to use for target query.
sample (np.array) – An array of samples for each parameter.
sample_id (int) – Numeric ID of sample that is being processed. Defaults to 0.
output_dir (str) – Path to output directory.
scenario (str) – Scenario name.
basin_name (str) – Name of basin; one of: Upper_Colorado, Yampa, San_Juan, Gunnison, White
skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 0
template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise, the default template in this package will be used.
factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’. Defaults to ‘multiply’.
data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.
min_bound_value (float) – Minimum feasible sampling bound. Minimum allowable value: 0.5
max_bound_value (float) – Maximum feasible sampling bound. Maximum allowable value: 1.5
use_values (bool) – If values is present in the modify dictionary, use it instead of the sampler. Defaults to False.
use_sampling (bool) – If bounds are not present in the modify dictionary, sampling will not be used. Defaults to True.
- Returns:
None
- Return type:
None
- Example:
import statemodify as stm
import numpy as np

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {
    # ids can either be 'struct' or 'id' values
    "ids": ["3600507.01", "3600507.02"],
    "bounds": [0.5, 1.0],
    # turn id on or off completely or for a given period
    # if 0 = off, 1 = on, YYYY = on for years >= YYYY, -YYYY = off for years > YYYY; see file header
    "on_off": [-1977, 1],
    # apply rank of administrative order where 0 is lowest (senior) and n is highest (junior); None is no change
    "admin": [None, 0],
    # optionally, pass a value that you want to assign for all ids; this overrides bounds
    "values": [0.7],
}

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# sample id for the current run
sample_id = 0

# sample array for each parameter
sample = np.array([0.39, -0.42])

# number of rows to skip in file after comment
skip_rows = 0

# name of field to query
query_field = "id"

# basin to process
basin_name = "Upper_Colorado"

# modify a single file for the given sample
stm.modify_single_ddr(
    modify_dict=setup_dict,
    query_field=query_field,
    sample=sample,
    sample_id=sample_id,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    skip_rows=skip_rows,
    template_file=None,
    use_values=False,
    use_sampling=True,
    factor_method="multiply",
    data_specification_file=None,
    min_bound_value=0.5,
    max_bound_value=1.5,
)
statemodify.apply_on_off_modification#
statemodify.apply_seniority_modification#
statemodify.modify_xbm_iwr#
- statemodify.modify_xbm_iwr(output_dir, flow_realizations_directory, scenario='', basin_name='Upper_Colorado', n_years=105, n_basins=5, xbm_skip_rows=1, iwr_skip_rows=1, xbm_template_file=None, iwr_template_file=None, xbm_data_specification_file=None, iwr_data_specification_file=None, months_in_year=12, seed_value=None, n_jobs=-1, n_samples=1, save_sample=False, randomly_select_flow_sample=True, desired_sample_number=None)[source]#
Generate flows for all samples for all basins in parallel to build modified XBM and IWR files.
- Parameters:
output_dir (str) – Path to output directory.
flow_realizations_directory (str) – Full path to the directory containing the flow realization files for each sample. E.g., AnnualQ_s0.txt files produced by the ‘hmm_multisite_sample’ function.
scenario (str) – Scenario name.
basin_name (str) – Name of basin; one of: Upper_Colorado, Yampa, San_Juan, Gunnison, White
n_years (int) – number of years
n_basins (int) – number of basins in HMM inputs
xbm_skip_rows (int) – number of rows to skip in XBM template file
iwr_skip_rows (int) – number of rows to skip in IWR template file
xbm_template_file (Union[None, str]) – Template file to build XBM adjustment off of
iwr_template_file (Union[None, str]) – Template file to build IWR adjustment off of
xbm_data_specification_file (Union[None, str]) – Specification YAML file for XBM format
iwr_data_specification_file (Union[None, str]) – Specification YAML file for IWR format
months_in_year (int) – Number of months in year
seed_value (Union[None, int], optional) – Integer to use for random seed or None if not desired
n_jobs (int) – Number of jobs to process in parallel. Defaults to -1 meaning all but 1 processor.
n_samples (int) – Number of samples to generate. Defaults to 1.
save_sample (bool) – Choice to save LHS sample or not; default False. If True, sample array will be written to the output directory.
randomly_select_flow_sample (bool) – Choice to randomly select a realization sample from the flow files directory.
desired_sample_number (Union[None, int]) – If ‘randomly_select_flow_sample’ is set to False, select a desired sample number as an integer
- Example:
import statemodify as stm

output_directory = "<your desired output directory>"
flow_realizations_directory = "<directory where the flow realization files are kept>"
scenario = "<your scenario name>"

# basin name to process
basin_name = "Upper_Colorado"

# seed value for reproducibility if so desired
seed_value = None

# number of jobs to launch in parallel; -1 is all but 1 processor used
n_jobs = -1

# number of samples to generate
n_samples = 100

# generate a batch of files using the generated LHS
stm.modify_xbm_iwr(
    output_dir=output_directory,
    flow_realizations_directory=flow_realizations_directory,
    scenario=scenario,
    basin_name=basin_name,
    seed_value=seed_value,
    n_jobs=n_jobs,
    n_samples=n_samples,
    save_sample=False,
    randomly_select_flow_sample=True,
)
statemodify.modify_single_xbm_iwr#
- statemodify.modify_single_xbm_iwr(iwr_multiplier, flow_realizations_directory, output_dir, scenario='', basin_name='Upper_Colorado', sample_id=0, n_sites=208, n_years=105, xbm_skip_rows=1, iwr_skip_rows=1, xbm_template_file=None, iwr_template_file=None, xbm_data_specification_file=None, iwr_data_specification_file=None, historical_column=0, months_in_year=12, seed_value=None, randomly_select_flow_sample=True, desired_sample_number=None)[source]#
Build modified XBM and IWR files for a single sample from an HMM-generated flow realization.
- Parameters:
flow_realizations_directory (str) – Full path to the directory containing the flow realization files for each sample. E.g., AnnualQ_s0.txt files produced by the ‘hmm_multisite_sample’ function.
iwr_multiplier (float) – Irrigation water requirement multiplier
sample_id (int) – Numeric ID of sample that is being processed. Defaults to 0.
output_dir (str) – Path to output directory.
scenario (str) – Scenario name.
basin_name (str) – Name of basin; one of: Upper_Colorado, Yampa, San_Juan, Gunnison, White
n_sites (int) – number of sites
n_years (int) – number of years
xbm_skip_rows (int) – number of rows to skip in XBM template file
iwr_skip_rows (int) – number of rows to skip in IWR template file
xbm_template_file (Union[None, str]) – Template file to build XBM adjustment off of
iwr_template_file (Union[None, str]) – Template file to build IWR adjustment off of
xbm_data_specification_file (Union[None, str]) – Specification YAML file for XBM format
iwr_data_specification_file (Union[None, str]) – Specification YAML file for IWR format
historical_column (int) – Index of year to use for historical data
months_in_year (int) – Number of months in year
seed_value (Union[None, int], optional) – Integer to use for random seed or None if not desired
randomly_select_flow_sample (bool) – Choice to randomly select a realization sample from the flow files directory.
desired_sample_number (Union[None, int]) – If ‘randomly_select_flow_sample’ is set to False, select a desired sample number as an integer
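A usage sketch in the pattern of this section's other examples (directory paths are placeholders, and the multiplier value is illustrative, not a recommendation):

```python
import statemodify as stm

flow_realizations_directory = "<directory where the flow realization files are kept>"
output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# irrigation water requirement multiplier for this sample (illustrative value)
iwr_multiplier = 0.7

# sample id for the current run
sample_id = 0

# basin to process
basin_name = "Upper_Colorado"

# build the modified XBM and IWR files for this single sample
stm.modify_single_xbm_iwr(
    iwr_multiplier=iwr_multiplier,
    flow_realizations_directory=flow_realizations_directory,
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    sample_id=sample_id,
    randomly_select_flow_sample=True,
)
```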
statemodify.get_reservoir_structure_ids#
- statemodify.get_reservoir_structure_ids(basin_name, template_file=None, data_specification_file=None)[source]#
Generate a list of structure ids that are in the input file.
- Parameters:
basin_name (str) – Name of basin; one of: Upper_Colorado, Yampa, San_Juan, Gunnison, White
template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.
data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.
- Returns:
List of structure ids
- Return type:
List
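A brief usage sketch; the ids returned depend on the packaged template file for the chosen basin:

```python
import statemodify as stm

# use the default template and data specification shipped with the package
structure_ids = stm.get_reservoir_structure_ids(basin_name="Gunnison")

# structure_ids is a list of reservoir structure id strings
print(structure_ids)
```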
statemodify.modify_single_res#
- statemodify.modify_single_res(output_dir, scenario, basin_name, sample, sample_id=0, template_file=None, data_specification_file=None, target_structure_id_list=None, skip_rows=0)[source]#
Modify a single reservoir (.res) file based on a user provided sample.
- Parameters:
output_dir (str) – Path to output directory.
scenario (str) – Scenario name.
basin_name (str) – Name of basin; one of: Upper_Colorado, Yampa, San_Juan, Gunnison, White
sample (np.array) – An array of samples for each parameter.
sample_id (int) – Numeric ID of sample that is being processed. Defaults to 0.
template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.
data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.
target_structure_id_list (Union[None, List[str]]) – Structure id list to process. If None, all structure ids will be processed.
skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 0
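A usage sketch in the pattern of this section's other examples (paths are placeholders; the sample value is illustrative, and its interpretation depends on the packaged data specification):

```python
import statemodify as stm
import numpy as np

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# a single sampled adjustment value (illustrative)
sample = np.array([0.9])

# sample id for the current run
sample_id = 0

# modify the reservoir file for this single sample; None processes all structure ids
stm.modify_single_res(
    output_dir=output_directory,
    scenario=scenario,
    basin_name="Gunnison",
    sample=sample,
    sample_id=sample_id,
    target_structure_id_list=None,
)
```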
statemodify.modify_res#
- statemodify.modify_res(output_dir, scenario, basin_name='Gunnison', template_file=None, data_specification_file=None, target_structure_id_list=None, skip_rows=0, seed_value=None, n_jobs=-1, n_samples=1, save_sample=False)[source]#
Parallel modification of StateMod reservoir (.res) files based on generated samples.
- Parameters:
output_dir (str) – Path to output directory.
scenario (str) – Scenario name.
basin_name (str) – Name of basin; one of: Upper_Colorado, Yampa, San_Juan, Gunnison, White
template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.
data_specification_file (Union[None, str]) – If a full path to a data specification template is provided it will be used. Otherwise, the default file in the package is used.
target_structure_id_list (Union[None, List[str]]) – Structure id list to process. If None, all structure ids will be processed.
skip_rows (int, optional) – Number of rows to skip after the commented fields end; default 0
seed_value (Union[None, int], optional) – Integer to use for random seed or None if not desired
n_jobs (int) – Number of jobs to process in parallel. Defaults to -1 meaning all but 1 processor.
n_samples (int) – Number of samples to generate. Defaults to 1.
save_sample (bool) – Choice to save LHS sample or not; default False. If True, sample array will be written to the output directory.
- Example:
import statemodify as stm

output_directory = "<your desired output directory>"
scenario = "<your scenario name>"

# basin name to process
basin_name = "Gunnison"

# seed value for reproducibility if so desired
seed_value = 0

# number of jobs to launch in parallel; -1 is all but 1 processor used
n_jobs = 2

# number of samples to generate
n_samples = 2

stm.modify_res(
    output_dir=output_directory,
    scenario=scenario,
    basin_name=basin_name,
    target_structure_id_list=None,
    seed_value=seed_value,
    n_jobs=n_jobs,
    n_samples=n_samples,
    save_sample=False,
)
HMM Functions#
statemodify.hmm_multisite_fit#
- statemodify.hmm_multisite_fit(n_basins=5, save_parameters=False, output_directory=None)[source]#
Fits a Hidden Markov Model (HMM) to multisite data.
- Parameters:
n_basins (int, optional) – The number of basins to fit the model to. Defaults to 5.
save_parameters (bool, optional) – If True, saves the model parameters to the specified output directory. Defaults to False.
output_directory (Union[None, str], optional) – The directory where model parameters will be saved if save_parameters is True. If None, parameters are not saved. Defaults to None.
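A brief usage sketch (the output directory is a placeholder):

```python
import statemodify as stm

# fit the HMM to the default five-basin data and save the fitted parameters
stm.hmm_multisite_fit(
    n_basins=5,
    save_parameters=True,
    output_directory="<your desired output directory>",
)
```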
statemodify.hmm_multisite_sample#
- statemodify.hmm_multisite_sample(logAnnualQ_h, transition_matrix, unconditional_dry, dry_state_means, wet_state_means, covariance_matrix_dry, covariance_matrix_wet, n_basins=5, n_alternatives=100, save_samples=True, output_directory=None)[source]#
Generate multisite samples for hydrological modeling using a Hidden Markov Model (HMM).
- Parameters:
logAnnualQ_h (ndarray) – Historical log space annual flows.
transition_matrix (ndarray) – The transition matrix of the HMM.
unconditional_dry (float) – Unconditional probability of the dry state.
dry_state_means (ndarray) – Means of the log space flows in the dry state.
wet_state_means (ndarray) – Means of the log space flows in the wet state.
covariance_matrix_dry (ndarray) – Covariance matrix for the dry state.
covariance_matrix_wet (ndarray) – Covariance matrix for the wet state.
n_basins (int) – Number of basins to simulate. Defaults to 5.
n_alternatives (int) – Number of alternative sequences to generate. Defaults to 100.
save_samples (bool) – Whether to save the generated samples. Defaults to True.
output_directory (Union[None, str]) – Directory where samples should be saved. Required if save_samples is True.
- Raises:
ValueError – If save_samples is True but output_directory is None.
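A usage sketch; it assumes the HMM parameter arrays (transition_matrix, dry_state_means, and so on) have already been obtained from a fit to the historical record, for example via hmm_multisite_fit, and are in scope:

```python
import statemodify as stm

# the parameter arrays below are assumed to come from a prior HMM fit;
# they are not defined here
stm.hmm_multisite_sample(
    logAnnualQ_h=logAnnualQ_h,
    transition_matrix=transition_matrix,
    unconditional_dry=unconditional_dry,
    dry_state_means=dry_state_means,
    wet_state_means=wet_state_means,
    covariance_matrix_dry=covariance_matrix_dry,
    covariance_matrix_wet=covariance_matrix_wet,
    n_basins=5,
    n_alternatives=100,
    save_samples=True,
    output_directory="<your desired output directory>",
)
```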
statemodify.get_samples#
- statemodify.get_samples(param_dict, basin_name, n_samples=1, sampling_method='LHS', seed_value=None)[source]#
Generate or load Latin Hypercube Samples (LHS).
Currently, this reads in an input file of precalculated samples; ideally, samples would instead be generated on demand, with a seed value supplied when reproducibility is needed.
- Parameters:
sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling
n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.
seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.
statemodify.generate_dry_state_means#
statemodify.generate_wet_state_means#
statemodify.generate_dry_covariance_matrix#
statemodify.generate_wet_covariance_matrix#
statemodify.generate_transition_matrix#
statemodify.calculate_array_monthly#
statemodify.calculate_array_annual#
statemodify.calculate_annual_sum#
statemodify.calculate_annual_mean_fractions#
statemodify.fit_iwr_model#
statemodify.generate_hmm_inputs#
statemodify.generate_flows#
- statemodify.generate_flows(dry_state_means, wet_state_means, covariance_matrix_dry, covariance_matrix_wet, transition_matrix, mu_0, sigma_0, mu_1, sigma_1, p00, p11, n_basins=5, n_years=105, seed_value=None)[source]#
Generate synthetic streamflow data using a Hidden Markov Model (HMM).
- Parameters:
dry_state_means (np.array) – mean streamflow values for dry state
wet_state_means (np.array) – mean streamflow values for wet state
covariance_matrix_dry (np.array) – covariance matrix for dry state
covariance_matrix_wet (np.array) – covariance matrix for wet state
transition_matrix (np.array) – transition matrix for HMM
mu_0 (float) – mean multiplier for dry state
sigma_0 (float) – covariance multiplier for dry state
mu_1 (float) – mean multiplier for wet state
sigma_1 (float) – covariance multiplier for wet state
p00 (float) – transition matrix multiplier for dry state
p11 (float) – transition matrix multiplier for wet state
n_basins (int) – number of sites to generate data for
n_years (int) – number of years to generate data for
seed_value (Union[None, int]) – random seed value
- Returns:
synthetic streamflow data
- Return type:
np.array
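The mechanism behind such a generator can be sketched with NumPy: a two-state (dry/wet) Markov chain is stepped year by year, and each year's annual log-space flows are drawn from the multivariate normal distribution of the active state. This is a minimal illustration of the idea only, not the package's implementation, and the state means, covariance, and self-transition probabilities below are toy values.

```python
import numpy as np


def sample_hmm_flows(dry_means, wet_means, cov_dry, cov_wet,
                     p00, p11, n_years, seed=None):
    """Sketch: sample annual log-space flows from a 2-state Gaussian HMM.

    State 0 = dry, state 1 = wet; p00 and p11 are self-transition
    probabilities for the dry and wet states, respectively.
    """
    rng = np.random.default_rng(seed)
    n_sites = len(dry_means)
    flows = np.empty((n_years, n_sites))
    state = 0  # arbitrarily start in the dry state
    for year in range(n_years):
        if state == 0:
            flows[year] = rng.multivariate_normal(dry_means, cov_dry)
            state = 0 if rng.random() < p00 else 1
        else:
            flows[year] = rng.multivariate_normal(wet_means, cov_wet)
            state = 1 if rng.random() < p11 else 0
    return flows


# toy parameters for two sites
dry = np.array([13.0, 12.5])
wet = np.array([14.0, 13.5])
cov = np.array([[0.1, 0.05], [0.05, 0.1]])

synthetic = sample_hmm_flows(dry, wet, cov, cov, p00=0.7, p11=0.6,
                             n_years=105, seed=42)
print(synthetic.shape)  # (105, 2)
```

The multipliers accepted by generate_flows (mu_0, sigma_1, p00, and so on) perturb these same quantities before sampling, which is what makes the generator useful for exploratory scenario generation.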
statemodify.generate_modified_file#
- statemodify.generate_modified_file(source_object, monthly_data_array, output_dir, scenario, sample_id=0)[source]#
Generate modified template data frame.
- Parameters:
source_object (object) – Instantiated object containing the source data and file specifics.
monthly_data_array (np.array) – Array of monthly data per year and site matching the shape of the input value columns.
sample_id (int) – Numeric ID of sample that is being processed. Defaults to 0.
output_dir (str) – Path to output directory.
scenario (str) – Scenario name.
Output Modification#
statemodify.convert_xdd#
- statemodify.convert_xdd(*, output_path='./output', allow_overwrite=False, xdd_files='**/*.xdd', id_subset=None, parallel_jobs=4, preserve_string_dtype=True)[source]#
Convert StateMod output .xdd files to compressed, columnar .parquet files.
Easily interoperate with pandas dataframes.
- Parameters:
output_path (str) – Path to a folder where outputs should be written; default “./output”
allow_overwrite (bool) – If False, abort if files already exist in the output_path; default False
xdd_files (List[str]) – File(s) or glob(s) to the .xdd files to convert; default “**/*.xdd”
id_subset (List[str]) – List of structure IDs to convert, or None for all; default None
parallel_jobs (int) – How many files to process in parallel; default 4
preserve_string_dtype (bool) – Keep parsed data as strings instead of casting to inferred data types; default True
- Returns:
None
- Return type:
None
- Example:
import statemodify as stm

stm.xdd.convert_xdd(
    # path to a directory where output .parquet files should be written
    output_path="./output",
    # whether to abort if .parquet files already exist at the output_path
    allow_overwrite=False,
    # path, glob, or a list of paths/globs to the .xdd files you want to convert
    xdd_files="**/*.xdd",
    # if the output .parquet files should only contain a subset of structure ids, list them here; None for all
    id_subset=None,
    # how many .xdd files to convert in parallel; optimally you will want 2-4 CPUs per parallel process
    parallel_jobs=4,
)

# look for your output .parquet files at the output_path!
statemodify.read_xre#
- statemodify.read_xre(xre_path, reservoir_name)[source]#
Read xre files generated by statemod using HMM data and historical data.
- Parameters:
xre_path (str) – Path to the .xre file to read.
reservoir_name (str) – Name of the reservoir of interest.
- Returns:
[0] a numpy array of reservoir storage across all HMM realizations [1] a numpy array of reservoir storage from the historical record [2] a numpy array of monthly means of storage from the historical record [3] a numpy array of monthly 1st percentiles of storage from the historical record
statemodify.extract_xre_data#
- statemodify.extract_xre_data(structure_name, structure_id, input_file=None, basin_name=None, write_csv=False, write_parquet=False, output_directory=None)[source]#
Extract a single reservoir from a raw .xre file into a Pandas data frame.
Optionally save output to a CSV or Parquet file.
- Parameters:
structure_name (str) – Name of the reservoir.
structure_id (str) – Structure ID for the reservoir of interest.
input_file (Union[None, str]) – Path to the xre file.
basin_name (Union[None, str]) – Name of basin for either: Upper_Colorado Yampa San_Juan Gunnison White
write_csv (bool) – Whether to write output to a CSV file.
write_parquet (bool) – Whether to write output to a Parquet file.
output_directory (Union[str, None]) – Path to the output directory.
- Return type:
DataFrame
- Returns:
pd.DataFrame, a Pandas data frame containing the extracted reservoir data
- Example:
import statemodify as stm

# path to the xre file
xre_file = "<path_to_file>/gm2015H.xre"

# structure ID for reservoir of interest
structure_id = "2803590"

# name of the reservoir
structure_name = "Blue_Mesa"

df = stm.extract_xre_data(
    structure_name=structure_name,
    structure_id=structure_id,
    input_file=xre_file,
    output_directory=None,
    write_csv=False,
    write_parquet=False,
)
Sampling#
statemodify.build_problem_dict#
- statemodify.build_problem_dict(modify_dict, fill=False)[source]#
Build the problem set from the input modify dictionary provided by the user.
- Parameters:
- Returns:
Dictionary ready for use by SALib sample generators
- Return type:
- Example:
import statemodify as stm

# a dictionary to describe what you want to modify and the bounds for the LHS
setup_dict = {"ids": ["10001", "10004"], "bounds": [-1.0, 1.0]}

# generate problem dictionary to use with SALib sampling components
problem_dict = stm.build_problem_dict(setup_dict, fill=False)
statemodify.generate_samples#
- statemodify.generate_samples(problem_dict, n_samples=1, sampling_method='LHS', seed_value=None)[source]#
Generate an array of statistically generated samples.
- Parameters:
problem_dict (Dict[str, List[Union[str, float]]]) – Dictionary of parameters to setup the sampler.
sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling
n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.
seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.
- Returns:
Array of samples
- Return type:
np.array
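For intuition, a basic Latin Hypercube Sample can be built directly with NumPy: each dimension's unit interval is split into n_samples equal strata, one point is drawn per stratum, and the strata are shuffled independently per dimension before scaling to the parameter bounds. This is a sketch of the idea only; statemodify delegates the actual sampling to SALib.

```python
import numpy as np


def lhs_sample(bounds, n_samples, seed=None):
    """Sketch: Latin Hypercube Sampling over a list of [low, high] bounds."""
    rng = np.random.default_rng(seed)
    n_dims = len(bounds)
    samples = np.empty((n_samples, n_dims))
    for d, (low, high) in enumerate(bounds):
        # one draw per equal-width stratum of [0, 1), then shuffle the strata
        strata = (np.arange(n_samples) + rng.random(n_samples)) / n_samples
        rng.shuffle(strata)
        samples[:, d] = low + strata * (high - low)
    return samples


# two parameters, each bounded on [-0.5, 1.0]
sample = lhs_sample(bounds=[[-0.5, 1.0], [-0.5, 1.0]], n_samples=4, seed=123)
print(sample.shape)  # (4, 2)
```

Because every stratum is hit exactly once per dimension, LHS covers the range of each parameter more evenly than plain random sampling for the same number of samples.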
statemodify.validate_modify_dict#
statemodify.generate_sample_iwr#
- statemodify.generate_sample_iwr(n_samples=1, sampling_method='LHS', seed_value=None)[source]#
Generate samples for the IWR multiplier.
- Parameters:
sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling
n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.
seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.
statemodify.generate_sample_all_params#
- statemodify.generate_sample_all_params(n_samples=1, sampling_method='LHS', seed_value=None)[source]#
Generate samples for all parameters.
- Parameters:
sampling_method (str) – Sampling method. Uses SALib’s implementation (see https://salib.readthedocs.io/en/latest/). Currently supports the following method: “LHS” for Latin Hypercube Sampling
n_samples (int, optional) – Number of LHS samples to generate, optional. Defaults to 1.
seed_value (Union[None, int], optional) – Seed value to use when generating samples for the purpose of reproducibility. Defaults to None.
Modification#
statemodify.set_alignment#
statemodify.pad_with_spaces#
statemodify.add_zero_padding#
statemodify.populate_dict#
- statemodify.populate_dict(line, field_dict, column_widths, column_list, data_types, replace_dict)[source]#
Populate the input dictionary with values from each line based on column widths.
- Parameters:
line (str) – Line of data as a string from the input file.
field_dict (dict) – Dictionary holding values for each field.
column_widths (dict) – Dictionary of column names to expected widths.
column_list (list) – List of columns to process.
data_types (dict) – Dictionary of column names to data types.
replace_dict (dict) – Dictionary mapping values to their replacements where necessary. For example, {"****": ""}
- Return type:
- Returns:
Populated data dictionary.
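The fixed-width parsing that populate_dict performs can be illustrated in plain Python: slice each line by the declared column widths, apply any replacements, and cast to the declared type. The column names, widths, and line layout below are hypothetical, chosen only for illustration.

```python
def parse_fixed_width(line, column_list, column_widths, data_types, replace_dict):
    """Sketch: split one fixed-width line into typed fields."""
    out = {}
    pos = 0
    for col in column_list:
        raw = line[pos: pos + column_widths[col]]
        pos += column_widths[col]
        # strip padding, then substitute sentinel values (e.g., "****")
        value = raw.strip()
        value = replace_dict.get(value, value)
        out[col] = data_types[col](value) if value != "" else value
    return out


# hypothetical 3-column layout: id (5 chars), year (5 chars), value (8 chars)
record = parse_fixed_width(
    "10001 1995    12.5",
    column_list=["id", "year", "value"],
    column_widths={"id": 5, "year": 5, "value": 8},
    data_types={"id": str, "year": int, "value": float},
    replace_dict={"****": ""},
)
print(record)  # {'id': '10001', 'year': 1995, 'value': 12.5}
```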
statemodify.prep_data#
- statemodify.prep_data(field_dict, template_file, column_list, column_widths, data_types, comment='#', skip_rows=0, replace_dict={})[source]#
Ingest statemod file and format into a data frame.
- Parameters:
field_dict (dict) – Dictionary holding values for each field.
template_file (str) – Statemod input file to parse.
column_widths (dict) – Dictionary of column names to expected widths.
column_list (list) – List of columns to process.
data_types (dict) – Dictionary of column names to data types.
comment (str) – Character(s) at the start of a line indicating the line should be ignored as a comment.
skip_rows (int) – The number of uncommented rows of data to skip.
replace_dict (dict) – Dictionary mapping values to their replacements where necessary. For example, {"****": ""}
- Returns:
[0] data frame of data from file [1] header data from file
statemodify.construct_outfile_name#
statemodify.construct_data_string#
statemodify.apply_adjustment_factor#
- statemodify.apply_adjustment_factor(data_df, value_columns, query_field, target_ids, factor, factor_method='add')[source]#
Apply adjustment to template file values for target ids using a sample factor.
- Parameters:
data_df (pd.DataFrame) – Data frame of data content from file.
value_columns (list) – Value columns that may be modified.
query_field (str) – Field name to conduct queries for.
target_ids (list) – Ids associated in query field to modify.
factor (float) – Value to multiply the selected value columns by.
factor_method (str) – Method by which to apply the factor. Options ‘add’, ‘multiply’, ‘assign’. Defaults to ‘add’.
- Returns:
A data frame of modified values to replace the original data with.
- Return type:
pd.DataFrame
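Conceptually, the adjustment is a masked update of the value columns for rows whose query field matches a target id. A minimal pandas sketch of that behavior, using toy data rather than the package's internal code:

```python
import pandas as pd


def adjust(data_df, value_columns, query_field, target_ids, factor,
           factor_method="add"):
    """Sketch: apply a sampled factor to the value columns of target ids."""
    df = data_df.copy()
    mask = df[query_field].isin(target_ids)
    if factor_method == "add":
        df.loc[mask, value_columns] += factor
    elif factor_method == "multiply":
        df.loc[mask, value_columns] *= factor
    elif factor_method == "assign":
        df.loc[mask, value_columns] = factor
    else:
        raise ValueError(f"unknown factor_method: {factor_method}")
    return df


df = pd.DataFrame({"id": ["10001", "10004", "10009"],
                   "jan": [10.0, 20.0, 30.0],
                   "feb": [1.0, 2.0, 3.0]})

modified = adjust(df, ["jan", "feb"], "id", ["10001", "10004"],
                  factor=0.5, factor_method="multiply")
print(modified["jan"].tolist())  # [5.0, 10.0, 30.0]
```

Rows whose id is not in the target list pass through unchanged, which is what lets the sampler perturb only the structures the user selected.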
statemodify.validate_bounds#
- statemodify.validate_bounds(bounds_list, min_value=-0.5, max_value=1.0)[source]#
Ensure sample bounds provided by the user conform to a feasible range of values in feet per month.
- Parameters:
- Returns:
None
- Return type:
None
- Example:
import statemodify as stm

# list of bounds to use for each parameter
bounds_list = [[-0.5, 1.0], [-0.5, 1.0]]

# validate bounds
stm.validate_bounds(bounds_list=bounds_list, min_value=-0.5, max_value=1.0)
statemodify.Modify#
- statemodify.Modify(comment_indicator, data_dict, column_widths, column_alignment, data_types, column_list, value_columns)[source]#
Modification template for data transformation.
- Parameters:
comment_indicator (str) – Character(s) indicating the line in the file is a comment. Defaults to "#".
data_dict (Dict[str, List[None]]) – Field specification in the file as a dictionary to hold the values for each field
column_widths (Dict[str, int]) – Column widths for the output file as a dictionary.
column_alignment (Dict[str, str]) – Expected column alignment.
data_types (Dict[str, type]) – Expected data types for each field
column_list (List[str]) – List of columns to process.
value_columns (List[str]) – List of columns that may be modified.
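Together these arguments describe a fixed-width file layout. A hedged illustration of the kind of specification they carry, using hypothetical columns rather than a real StateMod file definition:

```python
# hypothetical layout: an id column plus two monthly value columns
comment_indicator = "#"
column_list = ["id", "oct", "nov"]
data_dict = {col: [] for col in column_list}          # empty holder per field
column_widths = {"id": 12, "oct": 8, "nov": 8}        # characters per column
column_alignment = {"id": "left", "oct": "right", "nov": "right"}
data_types = {"id": str, "oct": float, "nov": float}
value_columns = ["oct", "nov"]                        # columns open to modification

print(list(data_dict))  # ['id', 'oct', 'nov']
```

The value_columns list is the subset of column_list that modification functions are permitted to alter; the rest of the specification exists so the file can be parsed and rewritten without disturbing its fixed-width format.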
Batch Modification#
statemodify.get_required_arguments#
statemodify.get_arguments_values#
statemodify.generate_parameters#
statemodify.modify_batch#
- statemodify.modify_batch(problem_dict)[source]#
Run multiple modification functions from the same problem set and unified sample.
- Parameters:
problem_dict (dict) – Dictionary of options for running multiple functions in batch mode. Used to generate the sample and pass other options to participating functions. See the following example.
- Returns:
Dictionary of settings and samples for each participating function.
- Return type:
- Example:
import statemodify as stm

# variables that apply to multiple functions
output_dir = "<your_desired_directory>"
basin_name = "Upper_Colorado"
scenario = "1"
seed_value = 77

# problem dictionary
problem_dict = {
    "n_samples": 2,
    "num_vars": 3,
    "names": ["modify_eva", "modify_ddr", "modify_ddm"],
    "bounds": [[-0.5, 1.0], [0.5, 1.0], [0.5, 1.0]],
    # additional settings for each function
    "modify_eva": {
        "seed_value": seed_value,
        "output_dir": output_dir,
        "scenario": scenario,
        "basin_name": basin_name,
        "query_field": "id",
        "ids": ["10001", "10004"],
    },
    "modify_ddr": {
        "seed_value": seed_value,
        "output_dir": output_dir,
        "scenario": scenario,
        "basin_name": basin_name,
        "query_field": "id",
        "ids": ["3600507.01", "3600507.02"],
        "admin": [None, 0],
        "on_off": [-1977, 1],
    },
    "modify_ddm": {
        "seed_value": seed_value,
        "output_dir": output_dir,
        "scenario": scenario,
        "basin_name": basin_name,
        "query_field": "id",
        "ids": ["3600507", "3600603"],
    },
}

# run in batch
fn_parameter_dict = stm.modify_batch(problem_dict=problem_dict)
Visualization#
statemodify.plot_flow_duration_curves#
- statemodify.plot_flow_duration_curves(flow_realizations_directory, save_figure=False, output_directory=None, figure_name=None, dpi=300)[source]#
Plot flow duration curves for historical and synthetic data.
- Parameters:
flow_realizations_directory (str) – Full path to the directory containing the flow realization files for each sample. E.g., AnnualQ_s0.txt files produced by the ‘hmm_multisite_sample’ function.
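A flow duration curve ranks flows in descending order against their exceedance probability. A minimal NumPy sketch of the computation that underlies this plot (the plotting itself is omitted, and the Weibull plotting position used here is one common convention, not necessarily the package's choice):

```python
import numpy as np


def flow_duration_curve(flows):
    """Sketch: return (exceedance_probability, sorted_flows) for one series."""
    sorted_flows = np.sort(np.asarray(flows, dtype=float))[::-1]  # descending
    n = sorted_flows.size
    # Weibull plotting position: rank / (n + 1)
    exceedance = np.arange(1, n + 1) / (n + 1)
    return exceedance, sorted_flows


prob, q = flow_duration_curve([12.0, 45.0, 7.0, 30.0])
print(q.tolist())  # [45.0, 30.0, 12.0, 7.0]
```

Plotting exceedance on the x-axis and sorted flow on the y-axis for both the historical record and each synthetic realization is what lets the two be compared visually.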
statemodify.plot_res_quantiles#
- statemodify.plot_res_quantiles(hmm_data, historical_mean, reservoir_name, save_figure=False, output_directory=None, dpi=300)[source]#
Plot quantiles of the HMM data and the mean and 1st percentile of the historical record.
- Parameters:
hmm_data (np.array) – A numpy array with monthly storage data from HMM realizations.
historical_mean (np.array) – A numpy array with the historical mean monthly storage.
reservoir_name (str) – Name of the reservoir.
statemodify.plot_reservoir_boxes#
- statemodify.plot_reservoir_boxes(hmm_data, historical_data, reservoir_name, save_figure=False, output_directory=None, dpi=300)[source]#
Generate a boxplot comparing the historical record to the HMM realizations.
- Parameters:
hmm_data (np.array) – A numpy array with monthly storage data from HMM realizations.
historical_data (np.array) – A numpy array with monthly storage data from the historical record.
reservoir_name (str) – Name of the reservoir.
Utilities#
statemodify.yaml_to_dict#
statemodify.select_template_file#
- statemodify.select_template_file(basin_name, template_file, extension=None)[source]#
Select either the default template file or a user provided one.
- Parameters:
basin_name (str) – Name of basin for either: Upper_Colorado Yampa San_Juan Gunnison White
template_file (Union[None, str]) – If a full path to a template file is provided it will be used. Otherwise the default template in this package will be used.
extension (Union[None, str]) – Extension of the target template file with no dot.
- Returns:
Template file path
- Return type: