Confidence Intervals
====================

Two-Sample Effect Sizes 
-----------------------
Researchers can use hierarch to compute confidence intervals for effect sizes. 
These intervals are computed via test inversion and, as a result, have the advantage
of essentially always achieving the nominal coverage. 

To put it another way, hierarch computes a 95% confidence interval by performing a 
permutation test against the null hypothesis that the true effect size is exactly equal 
to the observed effect size. The bounds of the acceptance region at alpha = 0.05
are then the bounds of the confidence interval. Let's consider the dataset from earlier. ::

    import pandas as pd
    import numpy as np
    import hierarch as ha

    data = pd.read_clipboard()

    print(data)

+------------+------+-------------+----------+
|  Condition | Well | Measurement |  Values  |
+============+======+=============+==========+
|    None    |   1  |      1      | 5.202258 |
+------------+------+-------------+----------+
|    None    |   1  |      2      | 5.136128 |
+------------+------+-------------+----------+
|    None    |   1  |      3      | 5.231401 |
+------------+------+-------------+----------+
|    None    |   2  |      1      | 5.336643 |
+------------+------+-------------+----------+
|    None    |   2  |      2      | 5.287973 |
+------------+------+-------------+----------+
|    None    |   2  |      3      | 5.375359 |
+------------+------+-------------+----------+
|    None    |   3  |      1      | 5.350692 |
+------------+------+-------------+----------+
|    None    |   3  |      2      | 5.465206 |
+------------+------+-------------+----------+
|    None    |   3  |      3      | 5.422602 |
+------------+------+-------------+----------+
| +Treatment |   4  |      1      | 5.695427 |
+------------+------+-------------+----------+
| +Treatment |   4  |      2      | 5.668457 |
+------------+------+-------------+----------+
| +Treatment |   4  |      3      | 5.752592 |
+------------+------+-------------+----------+
| +Treatment |   5  |      1      | 5.583562 |
+------------+------+-------------+----------+
| +Treatment |   5  |      2      | 5.647895 |
+------------+------+-------------+----------+
| +Treatment |   5  |      3      | 5.618315 |
+------------+------+-------------+----------+
| +Treatment |   6  |      1      | 5.642983 |
+------------+------+-------------+----------+
| +Treatment |   6  |      2      |  5.47072 |
+------------+------+-------------+----------+
| +Treatment |   6  |      3      | 5.686654 |
+------------+------+-------------+----------+

You can use the confidence_interval function in hierarch.stats to compute the 
confidence interval. ::

    from hierarch.stats import confidence_interval

    confidence_interval(
        data,
        treatment_col=0,
        compare='means',
        interval=95,
        bootstraps=500,
        permutations="all",
        random_state=1,
    )

    (-0.5373088054909549, -0.12010079984237881)

This interval does not cross 0, so it is consistent with significance at the alpha = 0.05
level.
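
To make the test-inversion idea concrete, the sketch below inverts an exact two-sample
permutation test on a small, non-hierarchical toy dataset using only NumPy. It is an
illustration under simplifying assumptions (a grid search over candidate effect sizes, a
difference-of-means statistic, no bootstrapping), not hierarch's actual implementation.
The helper names ``perm_pvalue`` and ``inverted_interval`` are hypothetical.

```python
from itertools import combinations

import numpy as np


def perm_pvalue(x, y, delta=0.0):
    """Exact two-sided permutation p-value for H0: mean(y) - mean(x) == delta."""
    y_shifted = y - delta  # under H0, removing delta makes the groups exchangeable
    pooled = np.concatenate([x, y_shifted])
    n_x, n_total = len(x), len(x) + len(y)
    observed = abs(y_shifted.mean() - x.mean())
    total = pooled.sum()
    as_extreme = 0
    n_perms = 0
    # Enumerate every way of relabeling n_x of the pooled values as group x.
    for idx in combinations(range(n_total), n_x):
        x_sum = pooled[list(idx)].sum()
        diff = (total - x_sum) / (n_total - n_x) - x_sum / n_x
        as_extreme += abs(diff) >= observed - 1e-12  # tolerance for float round-off
        n_perms += 1
    return as_extreme / n_perms


def inverted_interval(x, y, alpha=0.05, grid_size=101):
    """Smallest and largest candidate effects that the test does not reject."""
    center = y.mean() - x.mean()
    both = np.concatenate([x, y])
    half_width = both.max() - both.min()  # crude search range (an assumption)
    grid = np.linspace(center - half_width, center + half_width, grid_size)
    accepted = [d for d in grid if perm_pvalue(x, y, d) > alpha]
    return min(accepted), max(accepted)


# Toy data: two completely separated groups of five values each.
x = np.array([5.0, 5.1, 5.2, 5.3, 5.4])
y = np.array([5.6, 5.7, 5.8, 5.9, 6.0])
lower, upper = inverted_interval(x, y, alpha=0.05)
print(f"95% interval for mean(y) - mean(x): ({lower:.2f}, {upper:.2f})")
```

At a candidate effect of 0 the two toy groups are completely separated, so only 2 of the
252 possible relabelings are as extreme as the observed one. The test rejects, and the
resulting interval excludes 0.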

Because ha.stats.confidence_interval is based on a hypothesis test, it takes
the same input parameters as hypothesis_test. In addition, 
the new **interval** parameter sets the confidence level (in percent) and therefore the
width of the interval. ::

    confidence_interval(
        data,
        treatment_col=0,
        compare='means',
        interval=99,
        bootstraps=500,
        permutations="all",
        random_state=1,
    )

    (-0.9086402840632387, 0.25123067872990457)

    confidence_interval(
        data,
        treatment_col=0,
        compare='means',
        interval=68,
        bootstraps=500,
        permutations="all",
        random_state=1,
    )

    (-0.40676489798778065, -0.25064470734555316)

The 99% confidence interval does indeed cross 0, so we could not reject the null hypothesis
at the alpha = 0.01 level.

To build your confidence, you can perform a simulation analysis to ensure 
the confidence interval achieves the nominal coverage. You can set up a 
DataSimulator using the functions in hierarch.power as follows. ::

    import scipy.stats as stats
    from hierarch.power import DataSimulator

    parameters = [[0, 1.525],             # difference in means due to treatment
                  [stats.norm, 0, 1],     # column 1 distribution - stats.norm(loc=0, scale=1)
                  [stats.lognorm, 0.75]]  # column 2 distribution - stats.lognorm(s=0.75)

    sim = DataSimulator(parameters, random_state=1)

    hierarchy = [2,  # treatments
                 3,  # samples
                 3]  # within-sample measurements

    sim.fit(hierarchy)

The "true" difference between the two samples is 1.525 according to the simulation
parameters, so 95% of 95% confidence intervals that hierarch calculates should contain
this value. You can test this with the following code. ::

    true_difference = 1.525
    coverage = 0
    loops = 1000

    for i in range(loops):
        data = sim.generate()
        lower, upper = confidence_interval(data, 0, interval=95, 
                                           bootstraps=100, permutations='all')
        if lower <= true_difference <= upper:
            coverage += 1

    print("Coverage:", coverage/loops)
    
    Coverage: 0.946

The observed coverage of 94.6% is within the Monte Carlo error of the simulation (+/- 0.7%)
of the nominal 95%, so we can feel confident in this method of interval computation.
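
The +/- 0.7% figure is consistent with treating each simulated interval as a Bernoulli
trial, so that the estimated coverage has a binomial standard error of
sqrt(p * (1 - p) / n). The sketch below assumes that this is what "Monte Carlo error"
refers to here. The helper name ``monte_carlo_se`` is hypothetical and is not part of
hierarch.

```python
import math


def monte_carlo_se(coverage, loops):
    """Binomial standard error of a coverage estimate from `loops` independent trials."""
    return math.sqrt(coverage * (1 - coverage) / loops)


for nominal in (0.95, 0.99):
    print(nominal, round(monte_carlo_se(nominal, 1000), 4))
# 0.95 0.0069
# 0.99 0.0031
```

Under this assumption the error shrinks with the square root of the number of loops, so
quadrupling the loops halves it.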

Regression Coefficient Confidence Intervals
-------------------------------------------
The confidence_interval function can also be used on many-sample datasets that represent
a hypothesized linear relationship. Let's generate a dataset with a "true" slope of 
2/3. ::

    paramlist = [[0, 2/3, 4/3, 2], [stats.norm], [stats.norm]]
    hierarchy = [4, 2, 3]
    datagen = DataSimulator(paramlist, random_state=2)
    datagen.fit(hierarchy)
    data = datagen.generate()
    data

+---+---+---+----------+
| 0 | 1 | 2 | 3        |
+===+===+===+==========+
| 1 | 1 | 1 | 0.470264 |
+---+---+---+----------+
| 1 | 1 | 2 | -0.36477 |
+---+---+---+----------+
| 1 | 1 | 3 | 1.166621 |
+---+---+---+----------+
| 1 | 2 | 1 | -0.8333  |
+---+---+---+----------+
| 1 | 2 | 2 | -0.85157 |
+---+---+---+----------+
| 1 | 2 | 3 | -1.3149  |
+---+---+---+----------+
| 2 | 1 | 1 | 0.708561 |
+---+---+---+----------+
| 2 | 1 | 2 | 0.154405 |
+---+---+---+----------+
| 2 | 1 | 3 | 0.798892 |
+---+---+---+----------+
| 2 | 2 | 1 | -2.38199 |
+---+---+---+----------+
| 2 | 2 | 2 | -1.64797 |
+---+---+---+----------+
| 2 | 2 | 3 | -2.66707 |
+---+---+---+----------+
| 3 | 1 | 1 | 3.974506 |
+---+---+---+----------+
| 3 | 1 | 2 | 3.321076 |
+---+---+---+----------+
| 3 | 1 | 3 | 3.463612 |
+---+---+---+----------+
| 3 | 2 | 1 | 2.888003 |
+---+---+---+----------+
| 3 | 2 | 2 | 1.466742 |
+---+---+---+----------+
| 3 | 2 | 3 | 3.26068  |
+---+---+---+----------+
| 4 | 1 | 1 | 3.73128  |
+---+---+---+----------+
| 4 | 1 | 2 | 0.036135 |
+---+---+---+----------+
| 4 | 1 | 3 | -0.05483 |
+---+---+---+----------+
| 4 | 2 | 1 | 1.268975 |
+---+---+---+----------+
| 4 | 2 | 2 | 3.615265 |
+---+---+---+----------+
| 4 | 2 | 3 | 2.902522 |
+---+---+---+----------+

You can compute a confidence interval in the same manner as above. This time, set the
**compare** keyword argument to "corr". Passing it explicitly makes the intent clear, although
"corr" is also the default setting for **compare** when computing a confidence interval. ::

    from hierarch.stats import confidence_interval

    confidence_interval(
        data,
        treatment_col=0,
        compare='corr',
        interval=95,
        bootstraps=500,
        permutations="all",
        random_state=1,
    )

    (0.3410887712843298, 1.7540918236455125)

This confidence interval corresponds to the slope in a linear model. You can check this by
computing the slope coefficient via Ordinary Least Squares. ::

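    import scipy.stats as stats
    from hierarch.internal_functions import GroupbyMean

    # Collapse the innermost level by taking group means.
    grouper = GroupbyMean()
    test = grouper.fit_transform(data)

    # Regress the averaged values (last column) on the treatment column.
    stats.linregress(test[:,0], test[:,-1])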

    LinregressResult(slope=1.0515132531203024, intercept=-1.6658194480556106, 
    rvalue=0.6444075548383587, pvalue=0.08456152533094284, 
    stderr=0.5094006523081002, intercept_stderr=1.3950511403849626)

The slope, 1.0515, is indeed in the center of our computed interval (within Monte Carlo error).
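
The collapse-then-regress step can also be sketched with pandas and NumPy alone. The example
below assumes that the relevant aggregation is averaging the within-sample measurements, so
that each sample contributes one point to the regression. The toy data and column names are
made up for illustration and are constructed so that the sample means lie around the line
value = 2 * treatment.

```python
import numpy as np
import pandas as pd

# Toy hierarchy: 3 treatment levels x 2 samples x 2 measurements (hypothetical data).
rows = [
    (1, 1, 1, 1.0), (1, 1, 2, 2.0),
    (1, 2, 1, 2.0), (1, 2, 2, 3.0),
    (2, 1, 1, 3.0), (2, 1, 2, 4.0),
    (2, 2, 1, 4.0), (2, 2, 2, 5.0),
    (3, 1, 1, 5.0), (3, 1, 2, 6.0),
    (3, 2, 1, 6.0), (3, 2, 2, 7.0),
]
toy = pd.DataFrame(rows, columns=["treatment", "sample", "measurement", "value"])

# Average the measurements within each (treatment, sample) pair.
sample_means = toy.groupby(["treatment", "sample"], as_index=False)["value"].mean()

# Ordinary least squares fit of the sample means on the treatment level.
slope, intercept = np.polyfit(sample_means["treatment"], sample_means["value"], deg=1)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

Regressing on sample means rather than on individual measurements keeps repeated
measurements of the same sample from being counted as independent observations.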

Again, it is worthwhile to check that confidence_interval is performing adequately. You can
set up a simulation as above to check the coverage of the 95% confidence interval. ::

    true_slope = 2/3
    coverage = 0
    loops = 1000

    for i in range(loops):
        data = datagen.generate()
        lower, upper = confidence_interval(data, 0, interval=95, 
                                           bootstraps=100, permutations='all')
        if lower <= true_slope <= upper:
            coverage += 1

    print(coverage/loops)

    0.956

The observed coverage of 95.6% is within the Monte Carlo error of the simulation (+/- 0.7%)
of the nominal 95% and is therefore acceptable. You can check the coverage of other intervals
by changing the **interval** keyword argument, though be aware that Monte Carlo error depends
on the probability of the event of interest: at a nominal coverage of 99% with 1000 loops, it
shrinks to roughly +/- 0.3%. ::

    true_slope = 2/3
    coverage = 0
    loops = 1000

    for i in range(loops):
        data = datagen.generate()
        lower, upper = confidence_interval(data, 0, interval=99, 
                                           bootstraps=100, permutations='all')
        if lower <= true_slope <= upper:
            coverage += 1

    print(coverage/loops)

    0.99

Using the confidence_interval function, researchers can rapidly calculate confidence intervals for
effect sizes that maintain nominal coverage without worrying about distributional assumptions.