toad.merge module
- toad.merge.ChiMerge(feature, target, n_bins=None, min_samples=None, min_threshold=None, nan=-1, balance=True)
Chi-Merge
- Parameters
feature (array-like) – feature to be merged
target (array-like) – an array of target classes
n_bins (int) – number of bins to merge into
min_samples (number) – minimum number of samples in each group; if a float, it is interpreted as a percentage of the total samples
min_threshold (number) – minimum chi-square threshold
- Returns
array of split points
- Return type
array
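The bottom-up merging behind Chi-Merge can be illustrated with a minimal pure-Python sketch. It is not toad's implementation: the helper names chi2_pair and chi_merge_splits are hypothetical, only the n_bins stopping rule is shown (min_samples, min_threshold, nan and balance are ignored), and each split point is assumed to be the lower bound of a bin.

```python
from collections import Counter


def chi2_pair(a, b):
    """Chi-square statistic of two adjacent bins, given per-class counts."""
    classes = set(a) | set(b)
    total = sum(a.values()) + sum(b.values())
    chi2 = 0.0
    for row in (a, b):
        row_total = sum(row.values())
        for c in classes:
            col_total = a[c] + b[c]
            expected = row_total * col_total / total
            if expected > 0:
                chi2 += (row[c] - expected) ** 2 / expected
    return chi2


def chi_merge_splits(feature, target, n_bins):
    """Merge the most similar adjacent bins until n_bins remain."""
    # Start with one bin per distinct feature value, in sorted order.
    values = sorted(set(feature))
    index = {v: i for i, v in enumerate(values)}
    counts = [Counter() for _ in values]
    for x, y in zip(feature, target):
        counts[index[x]][y] += 1
    lows = list(values)  # lows[i] is the smallest value in bin i
    while len(counts) > n_bins:
        scores = [chi2_pair(counts[i], counts[i + 1])
                  for i in range(len(counts) - 1)]
        i = scores.index(min(scores))  # most similar adjacent pair
        counts[i] = counts[i] + counts[i + 1]
        del counts[i + 1]
        del lows[i + 1]
    # Assumption: a split point is the lower bound of every bin but the first.
    return lows[1:]


splits = chi_merge_splits([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1], n_bins=2)
print(splits)  # [4]
```

Adjacent bins with identical class distributions have a chi-square of zero, so they are merged first; the boundary that separates the two classes survives longest.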
- toad.merge.DTMerge(feature, target, nan=-1, n_bins=None, min_samples=1, **kwargs)
Merge by Decision Tree
- Parameters
feature (array-like) – feature to be merged
target (array-like) – target used to fit the decision tree
nan (number) – value used to fill NaN
n_bins (int) – number of groups to merge into
min_samples (int) – minimum number of samples in each leaf node
- Returns
array of split points
- Return type
array
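The idea of taking split points from decision tree thresholds can be sketched without any library. The code below is a simplified stand-in, not toad's implementation: gini, best_split and tree_splits are hypothetical names, targets are assumed to be binary 0/1, leaves are grown best-first by Gini impurity decrease, and min_samples and nan are ignored.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(pairs):
    """Best threshold for one leaf; pairs is a list of (x, y) sorted by x.

    Returns (impurity_decrease, threshold, left, right), or None if all
    feature values in the leaf are equal.
    """
    labels = [y for _, y in pairs]
    parent = gini(labels) * len(pairs)
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between equal feature values
        gain = (parent
                - gini(labels[:i]) * i
                - gini(labels[i:]) * (len(pairs) - i))
        if best is None or gain > best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (gain, threshold, pairs[:i], pairs[i:])
    return best


def tree_splits(feature, target, n_bins):
    """Grow at most n_bins leaves and return the sorted thresholds."""
    leaves = [sorted(zip(feature, target))]
    splits = []
    while len(leaves) < n_bins:
        candidates = [(best_split(leaf), k) for k, leaf in enumerate(leaves)]
        candidates = [(c, k) for c, k in candidates
                      if c is not None and c[0] > 0]
        if not candidates:
            break  # every leaf is pure or cannot be split
        (_, threshold, left, right), k = max(
            candidates, key=lambda item: item[0][0])
        leaves[k:k + 1] = [left, right]
        splits.append(threshold)
    return sorted(splits)


print(tree_splits([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1], n_bins=3))  # [3.5]
```

Fewer than n_bins groups can come back when no further split reduces impurity, as in the call above.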
- toad.merge.KMeansMerge(feature, target=None, nan=-1, n_bins=None, random_state=1)
Merge by KMeans
- Parameters
feature (array-like) – feature to be merged
target (array-like) – target used to fit the KMeans model
nan (number) – value used to fill NaN
n_bins (int) – number of groups to merge into
random_state (int) – random state used for the KMeans model
- Returns
split points of feature
- Return type
array
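A one-dimensional k-means can be written in a few lines to show how cluster centers might turn into split points. This is a sketch under stated assumptions, not toad's implementation: kmeans_splits is a hypothetical name, the initial centers are placed deterministically instead of using random_state, the target is ignored, and split points are assumed to be the midpoints between adjacent sorted centers.

```python
def kmeans_splits(values, n_bins, n_iter=100):
    """Cluster 1-D values into n_bins groups (n_bins >= 2) and return
    the midpoints between adjacent cluster centers."""
    data = sorted(values)
    # Deterministic initialisation: centers spread evenly over the sorted data.
    centers = [data[(len(data) - 1) * i // (n_bins - 1)]
               for i in range(n_bins)]
    for _ in range(n_iter):
        # Assignment step: attach every value to its nearest center.
        clusters = [[] for _ in centers]
        for x in data:
            nearest = min(range(len(centers)),
                          key=lambda k: abs(x - centers[k]))
            clusters[nearest].append(x)
        # Update step: move each center to the mean of its cluster.
        new_centers = [sum(c) / len(c) if c else centers[k]
                       for k, c in enumerate(clusters)]
        if new_centers == centers:
            break  # converged
        centers = new_centers
    centers = sorted(centers)
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


print(kmeans_splits([1, 2, 3, 11, 12, 13], n_bins=2))  # [7.0]
```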
- toad.merge.QuantileMerge(feature, nan=-1, n_bins=None, q=None)
Merge by quantile
- Parameters
feature (array-like) – feature to be merged
nan (number) – value used to fill NaN
n_bins (int) – number of groups to merge into
q (array-like) – list of percentage split points
- Returns
split points of feature
- Return type
array
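Quantile-based split points can be illustrated with a short pure-Python sketch. It is not toad's implementation: quantile_splits is a hypothetical name, quantiles are assumed to be evenly spaced with linear interpolation between neighbouring values, and the nan and q parameters are not handled.

```python
def quantile_splits(values, n_bins):
    """Return n_bins - 1 interior split points at evenly spaced quantiles."""
    data = sorted(values)
    splits = []
    for i in range(1, n_bins):
        # Fractional position of the i-th quantile in the sorted data.
        pos = (len(data) - 1) * i / n_bins
        lo = int(pos)
        hi = min(lo + 1, len(data) - 1)
        frac = pos - lo
        # Linear interpolation between the two neighbouring values.
        splits.append(data[lo] + (data[hi] - data[lo]) * frac)
    return splits


print(quantile_splits(range(1, 101), n_bins=4))  # [25.75, 50.5, 75.25]
```

Each resulting group holds roughly the same number of samples, which is the main difference from equal-width binning.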
- toad.merge.StepMerge(feature, nan=None, n_bins=None, clip_v=None, clip_std=None, clip_q=None)
Merge by step
- Parameters
feature (array-like) – feature to be merged
nan (number) – value used to fill NaN
n_bins (int) – number of groups to merge into
clip_v (number | tuple) – min/max value of clipping
clip_std (number | tuple) – min/max std of clipping
clip_q (number | tuple) – min/max quantile of clipping
- Returns
split points of feature
- Return type
array
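Merging by step amounts to equal-width binning, which the following sketch illustrates. It is not toad's implementation: step_splits is a hypothetical name, clip_v is assumed to be a (min, max) tuple that bounds the range before it is divided, and clip_std, clip_q and nan are left out.

```python
def step_splits(values, n_bins, clip_v=None):
    """Return n_bins - 1 equally spaced split points over the value range."""
    lo, hi = min(values), max(values)
    if clip_v is not None:
        # Assumption: clip_v is a (min, max) tuple limiting the range.
        lo, hi = max(lo, clip_v[0]), min(hi, clip_v[1])
    step = (hi - lo) / n_bins
    return [lo + step * i for i in range(1, n_bins)]


print(step_splits([0, 2, 5, 10], n_bins=5))                    # [2.0, 4.0, 6.0, 8.0]
print(step_splits([0, 5, 10, 1000], n_bins=5, clip_v=(0, 10)))  # [2.0, 4.0, 6.0, 8.0]
```

The second call shows why clipping matters: without it, the outlier 1000 would stretch every bin to a width of 200.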
- toad.merge.merge(feature, target=None, method='dt', return_splits=False, **kwargs)
Merge feature into groups
- Parameters
feature (array-like) – feature to be merged
target (array-like) – target used by the selected method, if it needs one
method (str) – strategy used to merge the feature: one of ‘dt’, ‘chi’, ‘quantile’, ‘step’, ‘kmeans’
return_splits (bool) – whether to also return the split points
n_bins (int) – number of groups to merge into
- Returns
an array of merged labels with the same size as feature; if return_splits is True, an array of split points is returned as well
- Return type
array
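The relationship between split points and merged labels can be sketched with the standard library. This is an illustration, not toad's implementation: apply_splits is a hypothetical name, and bins are assumed to be left-inclusive, so a value equal to a split point falls into the bin that starts there.

```python
from bisect import bisect_right


def apply_splits(feature, splits):
    """Label each value with the index of the bin it falls into.

    splits must be sorted in ascending order; bin k covers
    splits[k - 1] <= x < splits[k].
    """
    return [bisect_right(splits, x) for x in feature]


labels = apply_splits([1, 4, 6, 10], splits=[4, 8])
print(labels)  # [0, 1, 1, 2]
```

The label array has the same length as the input, and n split points produce at most n + 1 distinct labels.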