hierarch.stats.multi_sample_test

hierarch.stats.multi_sample_test(data_array, treatment_col, hypotheses='all', correction='fdr', compare='means', skip=None, bootstraps=100, permutations=1000, kind='weights', random_state=None)

Two-tailed multiple-sample hierarchical permutation test.

Equivalent to a post-hoc test after ANOVA. Results are more interpretable when the input data is a pandas DataFrame or a numpy object array.

Parameters:
data_array : 2D array or pandas DataFrame

Array-like containing both the independent and dependent variables to be analyzed. It’s assumed that the final (rightmost) column contains the dependent variable values.

treatment_col : int or str

The index number of the column containing labels to be compared. Indexing starts at 0. If input data is a pandas DataFrame, this can be the column name.

hypotheses : list of two-element lists or “all”, optional

Hypotheses to be tested. If “all”, every pairwise comparison is tested. A list of two-element lists restricts the test to the listed comparisons, which results in a less harsh multiple comparisons correction, by default “all”

correction : str, optional

Multiple comparisons correction to be performed after p-values are calculated. ‘fdr’ performs the Benjamini-Hochberg procedure for controlling the False Discovery Rate, by default “fdr”

compare : function or str, optional

The comparison to use to perform the hypothesis test, by default “means”

skip : list of ints, optional

Columns to skip in the bootstrap, by default None

bootstraps : int, optional

Number of bootstraps to perform, by default 100

permutations : int or “all”, optional

Number of permutations to perform per bootstrap sample. “all” performs an exact test, by default 1000

kind : str, optional

Bootstrapper algorithm. See Bootstrapper class, by default “weights”

random_state : int or numpy.random.Generator instance, optional

Seed or Generator used for reproducibility, by default None

Returns:
ndarray

numpy ndarray in which columns 0 and 1 contain the treatment labels being compared, column 2 contains the uncorrected p-value, and column 3 contains the corrected p-value if a correction was specified.

Raises:
KeyError

Raised if passed correction is not valid.

TypeError

Raised if input data is not ndarray or DataFrame.

KeyError

Raised if specified comparison labels do not exist in the input data.

Examples

This function performs pairwise tests akin to a post-hoc test after one-way ANOVA.

>>> from hierarch.power import DataSimulator
>>> from hierarch.stats import multi_sample_test
>>> import scipy.stats as stats
>>> paramlist = [[0, 1, 4, 0], [stats.norm], [stats.norm]]
>>> hierarchy = [4, 3, 3]

This dataset has four treatment conditions, two of which have the same mean (conditions 1 and 4). Condition 2 differs only slightly in mean from conditions 1 and 4, so this experiment is likely not well-powered to detect that difference. Condition 3, however, has a large mean difference from the others and should return a significant result against all three other conditions.

>>> datagen = DataSimulator(paramlist, random_state=1)
>>> datagen.fit(hierarchy)
>>> data = datagen.generate()
>>> data
array([[ 1.        ,  1.        ,  1.        , -0.39086989],
       [ 1.        ,  1.        ,  2.        ,  0.18267424],
       [ 1.        ,  1.        ,  3.        , -0.13653512],
       [ 1.        ,  2.        ,  1.        ,  1.42046436],
       [ 1.        ,  2.        ,  2.        ,  0.86134025],
       [ 1.        ,  2.        ,  3.        ,  0.52916139],
       [ 1.        ,  3.        ,  1.        , -0.45147139],
       [ 1.        ,  3.        ,  2.        ,  0.07324484],
       [ 1.        ,  3.        ,  3.        ,  0.33857926],
       [ 2.        ,  1.        ,  1.        , -0.57876014],
       [ 2.        ,  1.        ,  2.        ,  0.99090658],
       [ 2.        ,  1.        ,  3.        ,  0.70356708],
       [ 2.        ,  2.        ,  1.        , -0.80580661],
       [ 2.        ,  2.        ,  2.        ,  0.01634262],
       [ 2.        ,  2.        ,  3.        ,  1.73058377],
       [ 2.        ,  3.        ,  1.        ,  1.02418416],
       [ 2.        ,  3.        ,  2.        ,  1.66001757],
       [ 2.        ,  3.        ,  3.        ,  1.6636965 ],
       [ 3.        ,  1.        ,  1.        ,  5.58088552],
       [ 3.        ,  1.        ,  2.        ,  2.351026  ],
       [ 3.        ,  1.        ,  3.        ,  3.08544176],
       [ 3.        ,  2.        ,  1.        ,  6.62388971],
       [ 3.        ,  2.        ,  2.        ,  5.2278211 ],
       [ 3.        ,  2.        ,  3.        ,  5.24418148],
       [ 3.        ,  3.        ,  1.        ,  3.85056602],
       [ 3.        ,  3.        ,  2.        ,  2.71649723],
       [ 3.        ,  3.        ,  3.        ,  4.53203714],
       [ 4.        ,  1.        ,  1.        ,  0.40314658],
       [ 4.        ,  1.        ,  2.        , -0.93321956],
       [ 4.        ,  1.        ,  3.        , -0.38909417],
       [ 4.        ,  2.        ,  1.        , -0.04362144],
       [ 4.        ,  2.        ,  2.        , -0.91632938],
       [ 4.        ,  2.        ,  3.        , -0.06984773],
       [ 4.        ,  3.        ,  1.        ,  0.64219601],
       [ 4.        ,  3.        ,  2.        ,  0.58229922],
       [ 4.        ,  3.        ,  3.        ,  0.04042133]])

There are six total comparisons that can be made. The two conditions being compared are listed in the first two columns, and the p-values are in the final column.

>>> multi_sample_test(data, treatment_col=0, hypotheses="all",
...                   correction=None, bootstraps=1000,
...                   permutations="all", random_state=111)
  Condition 1 Condition 2 p-value
0         2.0         3.0  0.0355
1         1.0         3.0  0.0394
2         3.0         4.0  0.0407
3         2.0         4.0  0.1477
4         1.0         2.0  0.4022
5         1.0         4.0  0.4559
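
The count of six follows from taking every unordered pair of the four treatment labels. As a minimal sketch using only the standard library (the names `labels` and `pairs` are illustrative and not part of hierarch), the same pairs can be enumerated directly:

```python
from itertools import combinations

# The four treatment labels found in column 0 of the simulated data.
labels = [1.0, 2.0, 3.0, 4.0]

# Every unordered pair of labels: 4 choose 2 = 6 comparisons.
pairs = list(combinations(labels, 2))

print(len(pairs))
for first, second in pairs:
    print(first, second)
```

The order in which the pairs are listed here is lexicographic; the table above is instead sorted by p-value.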

Multiple comparison correction to control False Discovery Rate is advisable in this situation. The final column now shows the q-values, or “adjusted” p-values following the Benjamini-Hochberg procedure.

>>> multi_sample_test(data, treatment_col=0, hypotheses="all",
...                   correction='fdr', bootstraps=1000,
...                   permutations="all", random_state=111)
  Condition 1 Condition 2 p-value Corrected p-value
0         2.0         3.0  0.0355            0.0814
1         1.0         3.0  0.0394            0.0814
2         3.0         4.0  0.0407            0.0814
3         2.0         4.0  0.1477           0.22155
4         1.0         2.0  0.4022            0.4559
5         1.0         4.0  0.4559            0.4559
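
The corrected values in the table above can be checked by hand. The following is a minimal pure-Python sketch of the standard Benjamini-Hochberg adjustment, written for illustration only; `benjamini_hochberg` is a hypothetical helper and is not claimed to be how hierarch implements the correction internally.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # keeping a running minimum so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Uncorrected p-values copied from the table above.
p_values = [0.0355, 0.0394, 0.0407, 0.1477, 0.4022, 0.4559]
q_values = benjamini_hochberg(p_values)
print([round(q, 5) for q in q_values])
```

The three smallest p-values all receive the same adjusted value because 0.0407 * 6 / 3 = 0.0814 is smaller than the scaled values of the two p-values ranked ahead of it, and the running minimum carries it forward.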

Perhaps the experimenter is not interested in every pairwise comparison; for example, condition 2 may be a control against which all other conditions are compared. The comparisons of interest can be specified using a list of two-element lists.

>>> tests = [[2.0, 1.0], [2.0, 3.0], [2.0, 4.0]]
>>> multi_sample_test(data, treatment_col=0, hypotheses=tests,
...                   correction='fdr', bootstraps=1000,
...                   permutations="all", random_state=222)
  Condition 1 Condition 2 p-value Corrected p-value
0         2.0         3.0   0.036             0.108
1         2.0         4.0  0.1506            0.2259
2         2.0         1.0  0.4036            0.4036