hierarch.stats.welch_statistic

hierarch.stats.welch_statistic(data, col: int, treatment_labels)

Calculates Welch’s t statistic.

Takes a 2D data matrix, the index of the column used to classify the data, and the labels in that column that identify the two groups of interest. Assumes that the last column (index -1) of the data matrix holds the dependent variable.

Parameters

data (2D array): Data matrix. The last column is assumed to contain the dependent variable values.
col (int): Target column used to divide the dependent variable into two groups.
treatment_labels (1D array-like): The labels in the target column that define the two groups to compare.

Returns

float64: Welch’s t statistic.

Notes

Details on the validity of this test statistic can be found in “Studentized permutation tests for non-i.i.d. hypotheses and the generalized Behrens-Fisher problem” by Arnold Janssen. https://doi.org/10.1016/S0167-7152(97)00043-6.
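For reference, Welch's t statistic in its usual textbook form is sketched below. The symbols are not taken from the hierarch source: \bar{x}_i, s_i^2 and n_i are assumed to denote the sample mean, the unbiased sample variance, and the sample size of the group selected by the i-th entry of treatment_labels. The sign convention (first label minus second) is inferred from the example in this page, where labels (0, 1) yield a negative value.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```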

Examples

>>> import numpy as np
>>> from hierarch.stats import welch_statistic
>>> x = np.array([[0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
...               [1, 2, 3, 4, 5, 10, 11, 12, 13, 14]])
>>> x.T
array([[ 0,  1],
       [ 0,  2],
       [ 0,  3],
       [ 0,  4],
       [ 0,  5],
       [ 1, 10],
       [ 1, 11],
       [ 1, 12],
       [ 1, 13],
       [ 1, 14]])
>>> welch_statistic(x.T, 0, (0, 1))
-9.0

This is the same calculation that scipy.stats.ttest_ind performs when called with equal_var=False.

>>> import scipy.stats as stats
>>> a = [1, 2, 3, 4, 5]
>>> b = [10, 11, 12, 13, 14]
>>> stats.ttest_ind(a, b, equal_var=False)[0]
-9.0
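
The same value can be reproduced by hand with plain NumPy, which may help clarify what the col and treatment_labels arguments select. The following is a minimal sketch, not hierarch's actual implementation: the helper name welch_t_sketch is hypothetical, and it assumes exactly two labels and a dependent variable in the last column.

```python
import numpy as np


def welch_t_sketch(data, col, treatment_labels):
    """Hypothetical illustration of Welch's t; not hierarch's own code."""
    data = np.asarray(data, dtype=float)
    first, second = treatment_labels  # assumes exactly two labels

    # Select the dependent variable (last column) for each group.
    sample_a = data[data[:, col] == first, -1]
    sample_b = data[data[:, col] == second, -1]

    # Unbiased sample variance (ddof=1) of each group, divided by its size.
    var_a = sample_a.var(ddof=1) / sample_a.size
    var_b = sample_b.var(ddof=1) / sample_b.size

    # Difference of means over the unpooled standard error.
    return (sample_a.mean() - sample_b.mean()) / np.sqrt(var_a + var_b)


x = np.array([[0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
              [1, 2, 3, 4, 5, 10, 11, 12, 13, 14]]).T
result = welch_t_sketch(x, 0, (0, 1))
print(result)
```

Both groups here have a sample variance of 2.5 and five observations, so the standard error is 1 and the statistic reduces to the difference of the means, 3 - 12. Swapping the label order flips the sign.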