hierarch.stats.welch_statistic
- hierarch.stats.welch_statistic(data, col: int, treatment_labels)
Calculates Welch’s t statistic.
Takes a 2D data matrix, the index of the column used to group the data, and the labels in that column that identify the two groups to compare. Assumes that the last column (index -1) of the data matrix holds the dependent variable.
- Parameters
- data : 2D array
Data matrix. Assumes last column contains dependent variable values.
- col : int
Target column to be used to divide the dependent variable into two groups.
- treatment_labels : 1D array-like
Labels in target column to be used.
- Returns
- float64
Welch’s t statistic.
Notes
Details on the validity of this test statistic can be found in “Studentized permutation tests for non-i.i.d. hypotheses and the generalized Behrens-Fisher problem” by Arnold Janssen. https://doi.org/10.1016/S0167-7152(97)00043-6.
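For reference, Welch's t statistic for two groups is conventionally written as below. The symbols are introduced here for illustration: \bar{x}_i, s_i^2, and n_i denote the sample mean, unbiased sample variance, and size of group i. The sign convention (first label minus second label) is inferred from the example on this page, not stated explicitly by the source.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```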
Examples
>>> import numpy as np
>>> from hierarch.stats import welch_statistic
>>> x = np.array([[0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
...               [1, 2, 3, 4, 5, 10, 11, 12, 13, 14]])
>>> x.T
array([[ 0,  1],
       [ 0,  2],
       [ 0,  3],
       [ 0,  4],
       [ 0,  5],
       [ 1, 10],
       [ 1, 11],
       [ 1, 12],
       [ 1, 13],
       [ 1, 14]])
>>> welch_statistic(x.T, 0, (0, 1))
-9.0
This is the same calculation that scipy.stats.ttest_ind performs when equal_var=False.
>>> import scipy.stats as stats
>>> a = [1, 2, 3, 4, 5]
>>> b = [10, 11, 12, 13, 14]
>>> stats.ttest_ind(a, b, equal_var=False)[0]
-9.0
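To make the calculation concrete, here is a minimal, dependency-free sketch of the statistic described above. The function name `welch_t_sketch` is hypothetical and this is not hierarch's implementation; it assumes the data arrive as a sequence of rows, that exactly two labels are given, and that the first label's mean is the minuend.

```python
import math
import statistics


def welch_t_sketch(rows, col, treatment_labels):
    """Illustrative Welch's t; hypothetical helper, not hierarch's code."""
    first, second = treatment_labels  # assumes exactly two labels
    # The dependent variable is assumed to sit in the last column of each row.
    sample_a = [row[-1] for row in rows if row[col] == first]
    sample_b = [row[-1] for row in rows if row[col] == second]
    # Unbiased sample variances, each divided by its own group size.
    var_a = statistics.variance(sample_a) / len(sample_a)
    var_b = statistics.variance(sample_b) / len(sample_b)
    mean_diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    return mean_diff / math.sqrt(var_a + var_b)


# Same data as the example above, written as (label, value) rows.
rows = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5),
        (1, 10), (1, 11), (1, 12), (1, 13), (1, 14)]
t = welch_t_sketch(rows, 0, (0, 1))
print(t)
```

Because each group's variance is scaled by its own size, the two groups do not need equal variances or equal sizes. Swapping the order of the labels flips the sign of the result.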