hierarch.stats.welch_statistic

hierarch.stats.welch_statistic(sample_a, sample_b)

Calculates Welch’s t statistic.

Takes a 2D data matrix, a column to classify data by, and the labels corresponding to the data of interest. Assumes that the last (index -1) column in the data matrix is the dependent variable.

Parameters:
data : 2D array

Data matrix. Assumes last column contains dependent variable values.

col : int

Target column to be used to divide the dependent variable into two groups.

treatment_labels : 1D array-like

Labels in target column to be used.

Returns:
float64

Welch’s t statistic.

Notes

Details on the validity of this test statistic can be found in “Studentized permutation tests for non-i.i.d. hypotheses and the generalized Behrens-Fisher problem” by Arnold Janssen. https://doi.org/10.1016/S0167-7152(97)00043-6.

Examples

>>> import numpy as np
>>> from hierarch.stats import welch_statistic
>>> a = np.array([1, 2, 3, 4, 5])
>>> b = np.array([10, 11, 12, 13, 14])
>>> welch_statistic(a, b)
-9.0

This gives the same statistic as scipy’s ttest_ind function called with equal_var=False.

>>> import scipy.stats as stats
>>> a = np.array([1, 2, 3, 4, 5])
>>> b = np.array([10, 11, 12, 13, 14])
>>> stats.ttest_ind(a, b, equal_var=False)[0]
-9.0
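
For reference, the value above can be reproduced by hand. The sketch below assumes the usual definition of Welch’s t, t = (mean_a - mean_b) / sqrt(var_a / n_a + var_b / n_b), with unbiased sample variances. The helper name welch_t_sketch is hypothetical and is not part of hierarch; it only illustrates the arithmetic and makes no claim about how the library implements it.

```python
import numpy as np


def welch_t_sketch(sample_a, sample_b):
    """Hypothetical helper: Welch's t statistic for two 1D samples."""
    a = np.asarray(sample_a, dtype=np.float64)
    b = np.asarray(sample_b, dtype=np.float64)
    # Unbiased sample variance (ddof=1) of each group, divided by its size.
    se_squared = a.var(ddof=1) / a.size + b.var(ddof=1) / b.size
    # Difference in means scaled by the unpooled standard error.
    return (a.mean() - b.mean()) / np.sqrt(se_squared)


result = welch_t_sketch([1, 2, 3, 4, 5], [10, 11, 12, 13, 14])
print(result)  # -9.0
```

Here both groups have variance 2.5 and size 5, so the standard error is sqrt(0.5 + 0.5) = 1 and the statistic is simply the difference in means, 3 - 12 = -9.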