hierarch.resampling.Permuter
- class hierarch.resampling.Permuter(random_state: Generator | int | None = None)
Bases: object
Class for performing cluster-aware permutation on a target column.
- Parameters:
- random_state : int or numpy.random.Generator instance, optional
Seed or generator used for reproducibility, by default None.
Examples
When the column to resample is the first column, Permuter performs an ordinary shuffle.
>>> from hierarch.power import DataSimulator
>>> from hierarch.internal_functions import GroupbyMean
>>> paramlist = [[1]*2, [0]*6, [0]*18]
>>> hierarchy = [2, 3, 3]
>>> datagen = DataSimulator(paramlist)
>>> datagen.fit(hierarchy)
>>> data = datagen.generate()
>>> agg = GroupbyMean()
>>> test = agg.fit_transform(data)
>>> test
array([[1., 1., 1.],
       [1., 2., 1.],
       [1., 3., 1.],
       [2., 1., 1.],
       [2., 2., 1.],
       [2., 3., 1.]])
Permuter performs an in-place shuffle on the fitted data.
>>> from hierarch.resampling import Permuter
>>> permute = Permuter(random_state=1)
>>> permute.fit(test, col_to_permute=0, exact=False)
>>> permute.transform(test)
array([[2., 1., 1.],
       [2., 2., 1.],
       [1., 3., 1.],
       [2., 1., 1.],
       [1., 2., 1.],
       [1., 3., 1.]])
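Because the shuffle happens in place, the original ordering is lost unless a copy is taken first. The following is a minimal standard-library sketch of that behaviour, not hierarch's implementation: `shuffle_column_in_place` is a hypothetical helper, and nested lists stand in for the 2D ndarray.

```python
import copy
import random


def shuffle_column_in_place(rows, col, seed=None):
    """Shuffle one column of a row-major table in place (hypothetical helper)."""
    rng = random.Random(seed)
    values = [row[col] for row in rows]
    rng.shuffle(values)
    # Write the shuffled values back into the same row objects.
    for row, value in zip(rows, values):
        row[col] = value
    return rows


table = [[1.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 3.0, 1.0],
         [2.0, 1.0, 1.0], [2.0, 2.0, 1.0], [2.0, 3.0, 1.0]]
original = copy.deepcopy(table)  # keep a pristine copy before shuffling
result = shuffle_column_in_place(table, col=0, seed=1)
```

Here `result` and `table` refer to the same object, while `original` still holds the pre-shuffle rows. The same precaution would apply to an array passed to `transform`, for example by calling `.copy()` on it beforehand.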
If exact=True, Permuter will not repeat a permutation until all possible permutations have been exhausted.
>>> test = agg.fit_transform(data)
>>> permute = Permuter(random_state=1)
>>> permute.fit(test, col_to_permute=0, exact=True)
>>> permute.transform(test)
array([[2., 1., 1.],
       [2., 2., 1.],
       [2., 3., 1.],
       [1., 1., 1.],
       [1., 2., 1.],
       [1., 3., 1.]])
>>> next(permute.iterator)
[1.0, 2.0, 2.0, 2.0, 1.0, 1.0]
>>> next(permute.iterator)
[2.0, 1.0, 2.0, 2.0, 1.0, 1.0]
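To make the idea of exhausting all permutations concrete, here is a brute-force sketch using only the standard library. It is an illustration under assumptions, not the method hierarch uses internally: `distinct_permutations` is a hypothetical helper, and wrapping the result in `itertools.cycle` is one assumed way to get "no repeats until every ordering has been seen".

```python
from itertools import cycle, permutations


def distinct_permutations(labels):
    """Yield each distinct ordering of `labels` exactly once (hypothetical helper)."""
    seen = set()
    for perm in permutations(labels):
        # Repeated labels produce duplicate tuples; skip those already emitted.
        if perm not in seen:
            seen.add(perm)
            yield list(perm)


labels = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]
all_orderings = list(distinct_permutations(labels))
n_orderings = len(all_orderings)  # 6! / (3! * 3!) = 20 distinct orderings

# Cycling restarts only after every distinct ordering has been drawn once.
iterator = cycle(all_orderings)
draws = [next(iterator) for _ in range(n_orderings + 1)]
```

With three rows per treatment label there are only 20 distinct orderings, which is why exact enumeration is feasible for column 0. The brute-force filter above scales poorly and is meant for small inputs only.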
If the column to permute is not 0, Permuter performs a within-cluster shuffle. Note that values of column 1 were shuffled within their column 0 cluster.
>>> test = agg.fit_transform(data)
>>> permute = Permuter(random_state=2)
>>> permute.fit(test, col_to_permute=1, exact=False)
>>> permute.transform(test)
array([[1., 1., 1.],
       [1., 2., 1.],
       [1., 3., 1.],
       [2., 2., 1.],
       [2., 1., 1.],
       [2., 3., 1.]])
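The within-cluster shuffle can be sketched as follows. This is a simplified standard-library illustration, not hierarch's implementation: `shuffle_within_clusters` is a hypothetical helper, nested lists stand in for the ndarray, and clusters are assumed to be defined by a single column.

```python
import random
from collections import defaultdict


def shuffle_within_clusters(rows, cluster_col, target_col, seed=None):
    """Shuffle `target_col` separately inside each `cluster_col` group (hypothetical helper)."""
    rng = random.Random(seed)
    # Collect the row indices that belong to each cluster.
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[row[cluster_col]].append(i)
    # Shuffle the target values among the rows of one cluster at a time.
    for indices in groups.values():
        values = [rows[i][target_col] for i in indices]
        rng.shuffle(values)
        for i, value in zip(indices, values):
            rows[i][target_col] = value
    return rows


table = [[1.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 3.0, 1.0],
         [2.0, 1.0, 1.0], [2.0, 2.0, 1.0], [2.0, 3.0, 1.0]]
shuffled = shuffle_within_clusters(table, cluster_col=0, target_col=1, seed=2)
```

Values in column 1 never cross a column 0 boundary, so each cluster keeps the same set of values and only their order within the cluster changes.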
Exact within-cluster permutations are not implemented; there are typically too many of them for enumeration to be worthwhile.
>>> permute = Permuter(random_state=2)
>>> permute.fit(test, col_to_permute=1, exact=True)
Traceback (most recent call last):
    ...
NotImplementedError: Exact permutation only available for col_to_permute = 0.
Methods
fit(data, col_to_permute[, exact])    Fit the permuter to the target data.
transform(data)                       Permute target column in-place.
- fit(data: ndarray, col_to_permute: int, exact: bool = False) -> None
Fit the permuter to the target data.
- Parameters:
- data : 2D numeric ndarray
Target data.
- col_to_permute : int
Index of target column.
- exact : bool, optional
If True, enumerate all possible permutations and iterate through them one by one; by default False. Only supported when col_to_permute is 0.
- transform(data: ndarray) -> ndarray
Permute target column in-place.
- Parameters:
- data : 2D numeric ndarray
Target data.
- Returns:
- data : 2D numeric ndarray
Original data with target column shuffled, in a stratified fashion if necessary.