smash.factory.generate_samples

smash.factory.generate_samples(problem, generator='uniform', n=1000, random_state=None, mean=None, coef_std=None)

Generate multiple sets of variables.

Parameters:
problem : dict[str, Any]

Problem definition. The keys are:

  • 'num_vars' : the number of variables.

  • 'names' : the names of the variables.

  • 'bounds' : the upper and lower bounds of each variable (a sequence of (min, max)).
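
As a minimal sketch, the three keys can be checked for consistency before sampling. The helper check_problem below is hypothetical (it is not part of smash) and assumes only the structure described above: one name and one (min, max) pair per variable.

```python
def check_problem(problem):
    """Hypothetical helper (not part of smash): check the shape of a problem dict."""
    n = problem["num_vars"]
    names = problem["names"]
    bounds = problem["bounds"]
    # Each variable needs exactly one name and one (min, max) pair.
    if len(names) != n or len(bounds) != n:
        raise ValueError("'names' and 'bounds' must each have 'num_vars' entries")
    for name, (low, high) in zip(names, bounds):
        if low >= high:
            raise ValueError(f"lower bound must be below upper bound for {name!r}")
    return True


problem = {
    "num_vars": 2,
    "names": ["cp", "ct"],
    "bounds": [(1, 2000), (1, 1000)],
}
is_valid = check_problem(problem)
```

Whether generate_samples performs an equivalent check internally is not stated here, so validating the dictionary up front is a defensive choice rather than a requirement.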

generator : str, default 'uniform'

Sample generator. Should be one of:

  • 'uniform'

  • 'normal' (or 'gaussian')

n : int, default 1000

Number of generated samples.

random_state : int or None, default None

Random seed used to generate samples.

Note

If not given, the parameter sets are generated with a random seed.

mean : dict[str, float] or None, default None

Used only when the samples are drawn from a Gaussian distribution (generator is 'normal' or 'gaussian'). It defines the mean of the distribution for each variable, as a dictionary whose keys are the names of the rainfall-runoff parameters and/or initial states defined in the problem argument. In this case, a truncated normal distribution may be used to respect the bounds defined in problem. Any None value in the dictionary is replaced by the center of the bounds of the corresponding rainfall-runoff parameter or initial state.

Note

If not given and a Gaussian distribution is used, the mean of the distribution is set to the center of the variable bounds.
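
The fallback rule for mean can be sketched in plain Python. The helper fill_means is hypothetical and only mirrors the behavior described above (missing or None entries resolve to the center of the bounds); it is not the smash implementation.

```python
def fill_means(problem, mean=None):
    """Hypothetical helper (not part of smash): resolve one mean per variable."""
    mean = {} if mean is None else mean
    resolved = {}
    for name, (low, high) in zip(problem["names"], problem["bounds"]):
        value = mean.get(name)
        # Missing or None entries fall back to the center of the bounds.
        resolved[name] = (low + high) / 2 if value is None else value
    return resolved


problem = {
    "num_vars": 3,
    "names": ["cp", "ct", "kexc"],
    "bounds": [(1, 2000), (1, 1000), (-20, 5)],
}
means = fill_means(problem, {"cp": 500.0, "ct": None})
```

Here 'cp' keeps its user-supplied mean, while 'ct' (explicitly None) and 'kexc' (absent) both resolve to the midpoint of their bounds.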

coef_std : float or None, default None

A coefficient that sets the standard deviation when a Gaussian generator is used:

\[std = \frac{u - l}{coef\_std}\]

where \(u\) and \(l\) are the upper and lower bounds of each variable.

Note

If not given and a Gaussian distribution is used, coef_std defaults to 3:

\[std = \frac{u - l}{3}\]
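
To make the formula concrete, the sketch below computes std from the bounds and draws values that stay inside them. The helper names are hypothetical, and the use of rejection sampling to truncate the normal distribution is an assumption made for illustration; it is not taken from the smash source.

```python
import random


def gaussian_std(low, high, coef_std=None):
    """Standard deviation from the bounds: (u - l) / coef_std, with coef_std = 3 if None."""
    coef = 3 if coef_std is None else coef_std
    return (high - low) / coef


def sample_truncated_normal(low, high, mean=None, coef_std=None, rng=None):
    """Hypothetical sketch: draw one value from a normal law restricted to [low, high]."""
    if rng is None:
        rng = random.Random()
    mu = (low + high) / 2 if mean is None else mean
    sigma = gaussian_std(low, high, coef_std)
    # Rejection sampling (an assumption, not smash's method): redraw until in bounds.
    while True:
        x = rng.gauss(mu, sigma)
        if low <= x <= high:
            return x


# Bounds of 'kexc' from the example problem: l = -20, u = 5.
std_kexc = gaussian_std(-20, 5)  # (5 - (-20)) / 3
rng = random.Random(0)
draws = [sample_truncated_normal(-20, 5, rng=rng) for _ in range(5)]
```

A larger coef_std narrows the distribution around the mean, so fewer draws are rejected. A mean placed far outside the bounds would make this simple loop slow, which is one reason a dedicated truncated-normal sampler is usually preferable in practice.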
Returns:
samples : Samples

An object containing the generated samples.

See also

smash.Samples

Represents the generated samples result.

Examples

>>> from smash.factory import generate_samples

Define the problem with a dictionary

>>> problem = {
...     'num_vars': 4,
...     'names': ['cp', 'ct', 'kexc', 'llr'],
...     'bounds': [[1, 2000], [1, 1000], [-20, 5], [1, 1000]]
... }

Generate samples

>>> sr = generate_samples(problem, n=3, random_state=99)

Convert the samples to a pandas.DataFrame

>>> sr.to_dataframe()
            cp          ct       kexc         llr
0  1344.884839   32.414941 -12.559438    7.818907
1   976.668720  808.241913 -18.832607  770.023235
2  1651.164853  566.051802   4.765685  747.020334