Confidence Intervals ==================== Two-Sample Effect Sizes ----------------------- Researchers can use hierarch to compute confidence intervals for effect sizes. These intervals are computed via test inversion and, as a result, have the advantage of essentially always achieving the nominal coverage. To put it another way, hierarch computes a 95% confidence interval by performing a permutation test against the null hypothesis that true effect size is exactly equal to the observed effect size. Then, the bounds of the acceptance region at alpha = 0.05 are the bounds of the confidence interval. Let's consider the dataset from earlier. :: import pandas as pd import numpy as np import hierarch as ha data = pd.read_clipboard() print(data) +------------+------+-------------+----------+ | Condition | Well | Measurement | Values | +============+======+=============+==========+ | None | 1 | 1 | 5.202258 | +------------+------+-------------+----------+ | None | 1 | 2 | 5.136128 | +------------+------+-------------+----------+ | None | 1 | 3 | 5.231401 | +------------+------+-------------+----------+ | None | 2 | 1 | 5.336643 | +------------+------+-------------+----------+ | None | 2 | 2 | 5.287973 | +------------+------+-------------+----------+ | None | 2 | 3 | 5.375359 | +------------+------+-------------+----------+ | None | 3 | 1 | 5.350692 | +------------+------+-------------+----------+ | None | 3 | 2 | 5.465206 | +------------+------+-------------+----------+ | None | 3 | 3 | 5.422602 | +------------+------+-------------+----------+ | +Treatment | 4 | 1 | 5.695427 | +------------+------+-------------+----------+ | +Treatment | 4 | 2 | 5.668457 | +------------+------+-------------+----------+ | +Treatment | 4 | 3 | 5.752592 | +------------+------+-------------+----------+ | +Treatment | 5 | 1 | 5.583562 | +------------+------+-------------+----------+ | +Treatment | 5 | 2 | 5.647895 | +------------+------+-------------+----------+ | +Treatment | 5 | 3 | 5.618315 | +------------+------+-------------+----------+ | +Treatment | 6 | 1 | 5.642983 | +------------+------+-------------+----------+ | +Treatment | 6 | 2 | 5.47072 | +------------+------+-------------+----------+ | +Treatment | 6 | 3 | 5.686654 | +------------+------+-------------+----------+ You can use the confidence_interval function in hierarch.stats to compute the confidence interval. :: from hierarch.stats import confidence_interval confidence_interval( data, treatment_col=0, compare='means', interval=95, bootstraps=500, permutations="all", random_state=1, ) (-0.5373088054909549, -0.12010079984237881) This interval does not cross 0, so it is consistent with significance at the alpha = 0.05 level. Because ha.stats.confidence_interval is based on a hypothesis test, it requires the same input parameters as hypothesis_test. However, the new **interval** parameter determines the width of the interval. :: confidence_interval( data, treatment_col=0, compare='means', interval=99, bootstraps=500, permutations="all", random_state=1, ) (-0.9086402840632387, 0.25123067872990457) confidence_interval( data, treatment_col=0, compare='means', interval=68, bootstraps=500, permutations="all", random_state=1, ) (-0.40676489798778065, -0.25064470734555316) The 99% confidence interval does indeed cross 0, so we could not reject the null hypothesis at the alpha = 0.01 level. To build your confidence, you can perform a simulation analysis to ensure the confidence interval achieves the nominal coverage. You can set up a DataSimulator using the functions in hierarch.power as follows. :: from hierarch.power import DataSimulator parameters = [[0, 1.525], #difference in means due to treatment [stats.norm, 0, 1], #column 1 distribution - stats.norm(loc=0, scale=1) [stats.lognorm, 0.75]] #column 2 distribution - stats.lognorm(s = 0.75) sim = DataSimulator(parameters, random_state=1) import scipy.stats as stats hierarchy = [2, #treatments 3, #samples 3] #within-sample measurements sim.fit(hierarchy) The "true" difference between the two samples is 1.525 according to the simulation parameters, so 95% of 95% confidence intervals that hierarch calculates should contain this value. You can test this with the following code. :: true_difference = 1.525 coverage = 0 loops = 1000 for i in range(loops): data = sim.generate() lower, upper = confidence_interval(data, 0, interval=95, bootstraps=100, permutations='all') if lower <= true_difference <= upper: coverage += 1 print("Coverage:", coverage/loops) Coverage: 0.946 This is within the Monte Carlo error of the simulation (+/- 0.7%) of 95%, so we can feel confident in this method of interval computation. Regression Coefficient Confidence Intervals ------------------------------------------- The confidence_interval function can also be used on many-sample datasets that represent a hypothesized linear relationship. Let's generate a dataset with a "true" slope of 2/3. :: paramlist = [[0, 2/3, 4/3, 2], [stats.norm], [stats.norm]] hierarchy = [4, 2, 3] datagen = DataSimulator(paramlist, random_state=2) datagen.fit(hierarchy) data = datagen.generate() data +---+---+---+----------+ | 0 | 1 | 2 | 3 | +===+===+===+==========+ | 1 | 1 | 1 | 0.470264 | +---+---+---+----------+ | 1 | 1 | 2 | -0.36477 | +---+---+---+----------+ | 1 | 1 | 3 | 1.166621 | +---+---+---+----------+ | 1 | 2 | 1 | -0.8333 | +---+---+---+----------+ | 1 | 2 | 2 | -0.85157 | +---+---+---+----------+ | 1 | 2 | 3 | -1.3149 | +---+---+---+----------+ | 2 | 1 | 1 | 0.708561 | +---+---+---+----------+ | 2 | 1 | 2 | 0.154405 | +---+---+---+----------+ | 2 | 1 | 3 | 0.798892 | +---+---+---+----------+ | 2 | 2 | 1 | -2.38199 | +---+---+---+----------+ | 2 | 2 | 2 | -1.64797 | +---+---+---+----------+ | 2 | 2 | 3 | -2.66707 | +---+---+---+----------+ | 3 | 1 | 1 | 3.974506 | +---+---+---+----------+ | 3 | 1 | 2 | 3.321076 | +---+---+---+----------+ | 3 | 1 | 3 | 3.463612 | +---+---+---+----------+ | 3 | 2 | 1 | 2.888003 | +---+---+---+----------+ | 3 | 2 | 2 | 1.466742 | +---+---+---+----------+ | 3 | 2 | 3 | 3.26068 | +---+---+---+----------+ | 4 | 1 | 1 | 3.73128 | +---+---+---+----------+ | 4 | 1 | 2 | 0.036135 | +---+---+---+----------+ | 4 | 1 | 3 | -0.05483 | +---+---+---+----------+ | 4 | 2 | 1 | 1.268975 | +---+---+---+----------+ | 4 | 2 | 2 | 3.615265 | +---+---+---+----------+ | 4 | 2 | 3 | 2.902522 | +---+---+---+----------+ You can compute a confidence interval in the same manner as above. This time, you should set the **compare** keyword argument to "corr" for clarity, but "corr" is also the default setting for **compare** when computing a confidence interval. :: from hierarch.stats import confidence_interval confidence_interval( data, treatment_col=0, compare='corr', interval=95, bootstraps=500, permutations="all", random_state=1, ) (0.3410887712843298, 1.7540918236455125) This confidence interval corresponds to the slope in a linear model. You can check this by computing the slope coefficient via Ordinary Least Squares. :: import scipy.stats from hierarch.internal_functions import GroupbyMean grouper = GroupbyMean() test = grouper.fit_transform(data) stats.linregress(test[:,0], test[:,-1]) LinregressResult(slope=1.0515132531203024, intercept=-1.6658194480556106, rvalue=0.6444075548383587, pvalue=0.08456152533094284, stderr=0.5094006523081002, intercept_stderr=1.3950511403849626) The slope, 1.0515, is indeed in the center of our computed interval (within Monte Carlo error). Again, it is worthwhile to check that confidence_interval is performing adequately. You can set up a simulation as above to check the coverage of the 95% confidence interval. :: true_difference = 2/3 coverage = 0 loops = 1000 for i in range(loops): data = datagen.generate() lower, upper = confidence_interval(data, 0, interval=95, bootstraps=100, permutations='all') if lower <= true_difference <= upper: coverage += 1 print(coverage/loops) 0.956 This is within the Monte Carlo error of the simulation (+/- 0.7%) of 95% and therefore acceptable. You can check the coverage of other intervals by changing the **interval** keyword argument, though be aware that Monte Carlo error depends on the probability of the event of interest. :: true_difference = 2/3 coverage = 0 loops = 1000 for i in range(loops): data = datagen.generate() lower, upper = confidence_interval(data, 0, interval=99, bootstraps=100, permutations='all') if lower <= true_difference <= upper: coverage += 1 print(coverage/loops) 0.99 Using the confidence_interval function, researchers can rapidly calculate confidence intervals for effect sizes that maintain nominal coverage without worrying about distributional assumptions.