smash 0.4.0 Release Notes#
The smash 0.4.0 release continues the ongoing work to improve the code, fix possible bugs and clarify the documentation. The highlights are:
Regularization with full distributed mapping
Multiple forward runs in parallel
Addition of many user guides on the different optimization methods
Improved handling of sample generation results
Contributors#
This release was made possible thanks to the contributions of:
Ngo Nghi Truyen Huynh
François Colleoni
Maxime Jay-Allemand
Deprecations#
Makefile command#
The baseline_test makefile command has been deprecated and replaced by test_baseline.
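It can be used as follows:
make test_baseline
instead of:
make baseline_test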
Regularization parameter name#
The name of the regularization parameter in the Model.bayes_estimate() and Model.bayes_optimize() methods has been deprecated and changed from k to alpha.
It can be used as follows:
>>> model.bayes_optimize(alpha=2)
instead of:
>>> model.bayes_optimize(k=2)
BayesResult object#
The attribute l_curve of the smash.BayesResult object has been deprecated and replaced by lcurve. The key Mahalanobis_distance has also been changed to the shortened name mahal_dist.
It can be used as follows:
>>> br = model.bayes_estimate(alpha=range(5), inplace=True, return_br=True)
>>> br.lcurve["mahal_dist"]
instead of:
>>> br.l_curve["Mahalanobis_distance"]
Sample generator#
The argument backg_sol in the smash.generate_samples() method has been deprecated and replaced by mean. Note that mean is now a dictionary, whereas backg_sol used to be a 1D array-like.
It can be used as follows:
>>> sr = smash.generate_samples(problem, generator="normal", mean={"cp": 500, "cft": 200})
instead of:
>>> sr = smash.generate_samples(problem, generator="normal", backg_sol=[500, 200])
Improvements#
Return of generated samples#
The smash.generate_samples() method now returns an instance of the smash.SampleResult object instead of a pandas.DataFrame.
It can be used as follows:
>>> problem = {'num_vars': 1, 'names': ['cp'], 'bounds': [[1,2000]]}
>>> sr = smash.generate_samples(problem)
>>> sr.cp
Bayesian optimization#
The Model.bayes_estimate() and Model.bayes_optimize() methods now accept an instance of the smash.SampleResult object to supply the generated samples. As a result, all arguments related to sample generation have been removed from both methods.
It can be used as follows:
>>> problem = {'num_vars': 1, 'names': ['cp'], 'bounds': [[1,2000]]}
>>> sr = smash.generate_samples(problem)
>>> model.bayes_estimate(sample=sr)
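Model.bayes_optimize() accepts the sample argument in the same way (a minimal sketch reusing the alpha parameter introduced above; combining both arguments in a single call is an assumption here):
>>> model.bayes_optimize(sample=sr, alpha=2)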
Pipeline stage#
The pipeline stage build-tap has been renamed to tap-cmp and updated to allow a comparison between the source Tapenade file and the newly regenerated one. If an error occurs during this stage, it means that the source Tapenade file has not been regenerated.
Documentation#
Add a user guide on advanced optimization techniques.
Add the developers guide, the list of contributors and the license to the documentation.
New Features#
Conversion of Result objects#
We have added additional methods to some Result objects, which are:
PrcpIndicesResult.to_numpy() for the PrcpIndicesResult object.
SampleResult.to_numpy() and SampleResult.to_dataframe() for the SampleResult object.
It can be used as follows:
>>> problem = {'num_vars': 1, 'names': ['cp'], 'bounds': [[1,2000]]}
>>> sr = smash.generate_samples(problem) # create a SampleResult object
>>> sr.to_numpy() # convert to numpy array
>>> sr.to_dataframe() # convert to pandas dataframe
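The conversion of the PrcpIndicesResult object works the same way (a minimal sketch, assuming prcp_ind is an already computed PrcpIndicesResult instance; how it is obtained is not shown here):
>>> prcp_ind.to_numpy() # convert the precipitation indices to a numpy array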
Slice and iterate over the SampleResult object#
We have added two additional methods to the SampleResult object, which are:
SampleResult.slice()
SampleResult.iterslice()
It can be used as follows:
>>> problem = {'num_vars': 1, 'names': ['cp'], 'bounds': [[1,2000]]}
>>> sr = smash.generate_samples(problem) # create a SampleResult object
>>> slc = sr.slice(10) # slice the first 10 sets
>>> slc = sr.slice(start=20, end=50) # slice between the 20th and 50th set
>>> for slc_i in sr.iterslice(100): # iterate over sub-samples of 100 sets
...     slc_i
Regularization with full distributed mapping#
Regularization terms have been added for optimization with a distributed mapping. Two types of regularization function are available: prior and smoothing.
Hint
See a detailed explanation of the regularization functions in the Math / Num section.
It can be used as follows:
>>> model.optimize(mapping="distributed", options={"jreg_fun": "smoothing"})
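The prior regularization function is selected in the same way:
>>> model.optimize(mapping="distributed", options={"jreg_fun": "prior"})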
Model Multiple Run#
We have added a new method to the smash.Model object: Model.multiple_run(). This method computes multiple forward runs in parallel based on a sample generated with the smash.generate_samples() method.
It can be used as follows:
>>> setup, mesh = smash.load_dataset("cance")
>>> model = smash.Model(setup, mesh)
>>> problem = model.get_bound_constraints()
>>> sample = smash.generate_samples(problem, n=200, random_state=99)
>>> mtprr = model.multiple_run(sample, ncpu=4, return_qsim=True)
>>> mtprr.cost # access the cost values
>>> mtprr.qsim # access the simulated discharge values if return_qsim is True
This method also accepts the cost function arguments used in the Model.optimize() method (e.g. jobs_fun, wjobs_fun, etc.):
>>> mtprr = model.multiple_run(sample, jobs_fun="kge", gauge="all", ncpu=4, return_qsim=True)
Makefile command#
Three new makefile commands are available:
tap_cmp: compare the source Tapenade file with the newly regenerated one,
doc: generate the Sphinx documentation,
doc_clean: clean the Sphinx documentation.
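They can be used as follows:
make tap_cmp
make doc
make doc_clean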
Fixes#
Fix an issue where passing an unknown key in the options argument of the Model.optimize(), Net.add() and Net.compile() methods, or in the event_seg argument of the Model.optimize() method, would only result in a warning. The warning has been replaced with a KeyError to provide clearer feedback when a key that does not exist is passed.
For example:
>>> model.optimize(options={"unknown_key": 1})
resulting in an error:
KeyError: "Unknown algorithm options: 'unknown_key'"