hierarch.power.DataSimulator

class hierarch.power.DataSimulator(paramlist, random_state=None)

Bases: object

Class for simulating data for a power analysis.

Parameters:
paramlist : list of lists of parameters

See Examples for the accepted forms.

random_state : int or numpy.random.Generator instance, optional

Seedable for reproducibility, by default None
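
The effect of seeding can be illustrated with NumPy alone. The following is a minimal sketch, assuming that an int seed or an existing Generator is normalized with numpy.random.default_rng; whether DataSimulator does this internally is an assumption, not something this page confirms.

```python
import numpy as np

# The same int seed yields the same stream of random numbers.
rng_a = np.random.default_rng(1)
rng_b = np.random.default_rng(1)
draws_a = rng_a.normal(size=3)
draws_b = rng_b.normal(size=3)
same_stream = np.array_equal(draws_a, draws_b)

# An existing Generator is returned unaltered, so callers can share one stream.
shared = np.random.default_rng(rng_a)
is_same_object = shared is rng_a
```

Passing an int is convenient for a single reproducible run, while passing a Generator lets several components draw from one shared stream.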

Examples

Each sublist in paramlist describes one column of the hierarchy and can hold either fixed numeric values or a scipy.stats random distribution generator, optionally followed by its parameters. The following lines illustrate a few ways of specifying the same parameters (no treatment effect, and the other two columns are randomly generated Gaussian variables).

>>> import scipy.stats as stats
>>> paramlist = [[0, 0], [[stats.norm]]*6, [stats.norm, 0, 1]]
>>> paramlist = [[0]*2, [stats.norm], [stats.norm]]

Methods

fit(hierarchy)

Fit the DataSimulator to a hierarchy.

generate()

Generate a simulated dataset based on specified parameters and hierarchy.

fit(hierarchy)

Fit the DataSimulator to a hierarchy.

Parameters:
hierarchy : list of ints

Number of clusters in each column.

Examples

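This creates a data container with 2 clusters in column 0, 3 clusters nested within each column 0 cluster (6 total in column 1), and 3 clusters nested within each column 1 cluster (18 total in column 2). The final column of the container holds zeros as a placeholder for the values to be simulated.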

>>> import scipy.stats as stats
>>> from hierarch.power import DataSimulator
>>> paramlist = [[0, 0], [[stats.norm]]*6, [stats.norm, 0, 1]]
>>> datagen = DataSimulator(paramlist)
>>> hierarchy = [2, 3, 3]
>>> datagen.fit(hierarchy)
>>> datagen.container
array([[1., 1., 1., 0.],
       [1., 1., 2., 0.],
       [1., 1., 3., 0.],
       [1., 2., 1., 0.],
       [1., 2., 2., 0.],
       [1., 2., 3., 0.],
       [1., 3., 1., 0.],
       [1., 3., 2., 0.],
       [1., 3., 3., 0.],
       [2., 1., 1., 0.],
       [2., 1., 2., 0.],
       [2., 1., 3., 0.],
       [2., 2., 1., 0.],
       [2., 2., 2., 0.],
       [2., 2., 3., 0.],
       [2., 3., 1., 0.],
       [2., 3., 2., 0.],
       [2., 3., 3., 0.]])
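
The layout of the container can be reproduced by hand. The following is a minimal standard-library sketch of how a hierarchy of [2, 3, 3] expands into 18 labelled rows; it is an illustration of the layout shown above, not the library's implementation, and the names labels and container here are hypothetical local variables.

```python
from itertools import product

hierarchy = [2, 3, 3]

# One row per combination of 1-based cluster labels; the last column varies fastest.
labels = [list(combo) for combo in product(*(range(1, n + 1) for n in hierarchy))]

# Append a zero placeholder for the value column that has not been simulated yet.
container = [[float(x) for x in row] + [0.0] for row in labels]

n_rows = len(container)  # 2 * 3 * 3 rows
```

The row count is the product of the hierarchy entries, which is why the example above has 18 rows.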

generate()

Generate a simulated dataset based on specified parameters and hierarchy.

Returns:
2D numeric array

Simulated data.

Examples

>>> import scipy.stats as stats
>>> from hierarch.power import DataSimulator
>>> paramlist = [[0, 0], [stats.norm], [stats.norm]]
>>> datagen = DataSimulator(paramlist, random_state=1)
>>> hierarchy = [2, 3, 3]
>>> datagen.fit(hierarchy)
>>> datagen.generate()
array([[ 1.        ,  1.        ,  1.        , -0.19136904],
       [ 1.        ,  1.        ,  2.        ,  0.9267023 ],
       [ 1.        ,  1.        ,  3.        ,  0.71015659],
       [ 1.        ,  2.        ,  1.        ,  1.11575064],
       [ 1.        ,  2.        ,  2.        ,  0.85004038],
       [ 1.        ,  2.        ,  3.        ,  1.36833113],
       [ 1.        ,  3.        ,  1.        , -0.40601701],
       [ 1.        ,  3.        ,  2.        ,  0.16752713],
       [ 1.        ,  3.        ,  3.        , -0.15168224],
       [ 2.        ,  1.        ,  1.        , -0.70431102],
       [ 2.        ,  1.        ,  2.        , -1.26343512],
       [ 2.        ,  1.        ,  3.        , -1.59561398],
       [ 2.        ,  2.        ,  1.        ,  0.1234474 ],
       [ 2.        ,  2.        ,  2.        ,  0.64816363],
       [ 2.        ,  2.        ,  3.        ,  0.91349805],
       [ 2.        ,  3.        ,  1.        ,  0.17077167],
       [ 2.        ,  3.        ,  2.        ,  1.74043839],
       [ 2.        ,  3.        ,  3.        ,  1.45309889]])
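
To make the structure of such a dataset concrete, the following is a hedged NumPy sketch of one way nested Gaussian data of this shape could be built by hand. It assumes an additive model (a fixed effect per column 0 cluster, one normal draw per column 1 cluster, and one normal draw per row); whether DataSimulator composes its values exactly this way is an assumption, and the numbers will not match the output above.

```python
import numpy as np
from itertools import product

hierarchy = [2, 3, 3]
rng = np.random.default_rng(1)

# Label columns: every combination of 1-based cluster labels.
labels = np.array(
    list(product(*(range(1, n + 1) for n in hierarchy))), dtype=float
)

# Column 0: fixed effects of 0 for both top-level clusters (no treatment effect).
top_effect = np.zeros(hierarchy[0])
# Column 1: one standard-normal draw per mid-level cluster (2 * 3 = 6 clusters).
mid_effect = rng.normal(size=hierarchy[0] * hierarchy[1])
# Column 2: one standard-normal draw per observation (18 rows).
obs_noise = rng.normal(size=len(labels))

rows_per_top = hierarchy[1] * hierarchy[2]  # 9 rows under each top-level cluster
rows_per_mid = hierarchy[2]                 # 3 rows under each mid-level cluster

# Assumed additive model: each row sums the effects of the clusters it belongs to.
values = (
    np.repeat(top_effect, rows_per_top)
    + np.repeat(mid_effect, rows_per_mid)
    + obs_noise
)

simulated = np.column_stack([labels, values])
```

Under this assumed model, rows that share a column 1 cluster share one random offset, which is what makes the simulated observations correlated within clusters.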