hierarch.stats.multi_sample_test

hierarch.stats.multi_sample_test(data_array, treatment_col, hypotheses='all', correction='fdr', compare='means', skip=None, bootstraps=100, permutations=1000, kind='weights', random_state=None)

Two-tailed multiple-sample hierarchical permutation test.

Equivalent to a post-hoc test after ANOVA. Results are more interpretable when the input data is a pandas DataFrame or a numpy object array.

Parameters:

data_array : 2D array or pandas DataFrame
    Array-like containing both the independent and dependent variables to be analyzed. It’s assumed that the final (rightmost) column contains the dependent variable values.
treatment_col : int or str
    The index number of the column containing labels to be compared. Indexing starts at 0. If input data is a pandas DataFrame, this can be the column name.
hypotheses : list of two-element lists or “all”, optional
    Hypotheses to be tested. If “all”, every pairwise comparison will be tested. A list of lists can be passed to restrict the comparisons, which results in a less harsh multiple comparisons correction. By default “all”.
correction : str, optional
    Multiple comparisons correction to be performed after p-values are calculated. “fdr” performs the Benjamini-Hochberg procedure for controlling False Discovery Rate. By default “fdr”.
compare : function or str, optional
    The comparison to use to perform the hypothesis test, by default “means”
skip : list of ints, optional
    Columns to skip in the bootstrap, by default None
bootstraps : int, optional
    Number of bootstraps to perform, by default 100
permutations : int or “all”
    Number of permutations to perform PER bootstrap sample. “all” for exact test, by default 1000
kind : str, optional
    Bootstrapper algorithm. See Bootstrapper class, by default “weights”
random_state : int or numpy.random.Generator instance, optional
    Seedable for reproducibility, by default None

Returns:

ndarray: numpy ndarray in which columns 0 and 1 hold the treatment labels being compared, column 2 holds the uncorrected p-value, and column 3 holds the corrected p-value if a correction was specified.

Raises:

KeyError: Raised if passed correction is not valid.
TypeError: Raised if input data is not ndarray or DataFrame.
KeyError: Raised if specified comparison labels do not exist in the input data.

Examples

This function performs pairwise tests akin to a post-hoc test after one-way ANOVA.

>>> from hierarch.stats import multi_sample_test
>>> from hierarch.power import DataSimulator
>>> import scipy.stats as stats
>>> paramlist = [[0, 1, 4, 0], [stats.norm], [stats.norm]]
>>> hierarchy = [4, 3, 3]

This dataset has four treatment conditions, two of which (conditions 1 and 4) have the same mean. Condition 2 differs only slightly in mean from conditions 1 and 4, and this experiment is likely not well-powered enough to detect that difference. Condition 3, however, has a large mean difference from the others and should return a significant result against all three other conditions.

>>> datagen = DataSimulator(paramlist, random_state=1)
>>> datagen.fit(hierarchy)
>>> data = datagen.generate()
>>> data
array([[ 1.        ,  1.        ,  1.        , -0.39086989],
       [ 1.        ,  1.        ,  2.        ,  0.18267424],
       [ 1.        ,  1.        ,  3.        , -0.13653512],
       [ 1.        ,  2.        ,  1.        ,  1.42046436],
       [ 1.        ,  2.        ,  2.        ,  0.86134025],
       [ 1.        ,  2.        ,  3.        ,  0.52916139],
       [ 1.        ,  3.        ,  1.        , -0.45147139],
       [ 1.        ,  3.        ,  2.        ,  0.07324484],
       [ 1.        ,  3.        ,  3.        ,  0.33857926],
       [ 2.        ,  1.        ,  1.        , -0.57876014],
       [ 2.        ,  1.        ,  2.        ,  0.99090658],
       [ 2.        ,  1.        ,  3.        ,  0.70356708],
       [ 2.        ,  2.        ,  1.        , -0.80580661],
       [ 2.        ,  2.        ,  2.        ,  0.01634262],
       [ 2.        ,  2.        ,  3.        ,  1.73058377],
       [ 2.        ,  3.        ,  1.        ,  1.02418416],
       [ 2.        ,  3.        ,  2.        ,  1.66001757],
       [ 2.        ,  3.        ,  3.        ,  1.6636965 ],
       [ 3.        ,  1.        ,  1.        ,  5.58088552],
       [ 3.        ,  1.        ,  2.        ,  2.351026  ],
       [ 3.        ,  1.        ,  3.        ,  3.08544176],
       [ 3.        ,  2.        ,  1.        ,  6.62388971],
       [ 3.        ,  2.        ,  2.        ,  5.2278211 ],
       [ 3.        ,  2.        ,  3.        ,  5.24418148],
       [ 3.        ,  3.        ,  1.        ,  3.85056602],
       [ 3.        ,  3.        ,  2.        ,  2.71649723],
       [ 3.        ,  3.        ,  3.        ,  4.53203714],
       [ 4.        ,  1.        ,  1.        ,  0.40314658],
       [ 4.        ,  1.        ,  2.        , -0.93321956],
       [ 4.        ,  1.        ,  3.        , -0.38909417],
       [ 4.        ,  2.        ,  1.        , -0.04362144],
       [ 4.        ,  2.        ,  2.        , -0.91632938],
       [ 4.        ,  2.        ,  3.        , -0.06984773],
       [ 4.        ,  3.        ,  1.        ,  0.64219601],
       [ 4.        ,  3.        ,  2.        ,  0.58229922],
       [ 4.        ,  3.        ,  3.        ,  0.04042133]])

There are six total comparisons that can be made. The labels of the two conditions being compared are in the first two columns (“Condition 1” and “Condition 2”), and the p-values are in the final column.

>>> multi_sample_test(data, treatment_col=0, hypotheses="all",
...                   correction=None, bootstraps=1000,
...                   permutations="all", random_state=111)
  Condition 1 Condition 2 p-value
0         2.0         3.0  0.0355
1         1.0         3.0  0.0394
2         3.0         4.0  0.0407
3         2.0         4.0  0.1477
4         1.0         2.0  0.4022
5         1.0         4.0  0.4559
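
The six rows correspond to every unordered pair of the four treatment labels. As a minimal sketch of that counting (standard library only; the `labels` list is typed in by hand from column 0 of the example data, and this is not how hierarch itself builds the list of comparisons):

```python
from itertools import combinations

# Treatment labels found in column 0 of the example data (entered by hand).
labels = [1.0, 2.0, 3.0, 4.0]

# Every unordered pair of distinct labels: "4 choose 2" = 6 comparisons.
pairs = list(combinations(labels, 2))

print(len(pairs))
for first, second in pairs:
    print(first, second)
```

Passing a shorter list of pairs through `hypotheses` reduces the number of tests that the multiple comparisons correction has to account for.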

Multiple comparison correction to control False Discovery Rate is advisable in this situation. The final column now shows the q-values, or “adjusted” p-values following the Benjamini-Hochberg procedure.

>>> multi_sample_test(data, treatment_col=0, hypotheses="all",
...                   correction='fdr', bootstraps=1000,
...                   permutations="all", random_state=111)
  Condition 1 Condition 2 p-value Corrected p-value
0         2.0         3.0  0.0355            0.0814
1         1.0         3.0  0.0394            0.0814
2         3.0         4.0  0.0407            0.0814
3         2.0         4.0  0.1477           0.22155
4         1.0         2.0  0.4022            0.4559
5         1.0         4.0  0.4559            0.4559
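
The corrected column can be reproduced by hand from the uncorrected p-values. The following is a rough sketch of the Benjamini-Hochberg adjustment in plain Python; `benjamini_hochberg` is a hypothetical helper written for illustration and is not part of hierarch's API.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # enforcing that adjusted values never increase as rank decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Uncorrected p-values copied from the table above.
p = [0.0355, 0.0394, 0.0407, 0.1477, 0.4022, 0.4559]
q = benjamini_hochberg(p)
print([round(value, 5) for value in q])
```

For example, the third-smallest p-value is scaled by 6 / 3, giving 0.0407 * 2 = 0.0814, and the two smaller p-values inherit that value because adjusted p-values are not allowed to exceed those of higher-ranked tests.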

Perhaps the experimenter is not interested in every pairwise comparison. For example, condition 2 might be a control that all other conditions are meant to be compared to. The comparisons of interest can be specified using a list.

>>> tests = [[2.0, 1.0], [2.0, 3.0], [2.0, 4.0]]
>>> multi_sample_test(data, treatment_col=0, hypotheses=tests,
...                   correction='fdr', bootstraps=1000,
...                   permutations="all", random_state=222)
  Condition 1 Condition 2 p-value Corrected p-value
0         2.0         3.0   0.036             0.108
1         2.0         4.0  0.1506            0.2259
2         2.0         1.0  0.4036            0.4036
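
The examples above pass `permutations="all"`, which requests an exact test. To illustrate what "exact" means, here is a simplified sketch of a two-tailed exact permutation test on a difference of means. It deliberately ignores the hierarchical structure and the bootstrap step, so it is not hierarch's algorithm; the function name `exact_permutation_p` and the sample values are made up for illustration.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(group_a, group_b):
    """Two-tailed exact permutation p-value for a difference of means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        # Count relabelings at least as extreme as the observed one
        # (small tolerance guards against floating-point ties).
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical, clearly separated samples of three values each.
control = [0.1, 0.4, -0.2]
treated = [3.9, 4.5, 4.1]
print(exact_permutation_p(control, treated))
```

With three values per group there are only 20 possible relabelings, and only the observed split and its mirror image are this extreme, so the smallest attainable two-tailed p-value in this flat setting is 2 / 20 = 0.1.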