toad.transform module
-
class toad.transform.Transformer
Bases: sklearn.base.TransformerMixin, toad.utils.mixin.RulesMixin
Base class for transformers.
-
fit()
Fit method; see the fit_ method for details.
-
export(**kwargs)
Export rules to a dict or a JSON file.
Parameters: to_json (str|IOBase) – JSON file to save the rules to
Returns: dictionary of rules
Return type: dict
-
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits the transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters:
- X (array-like of shape (n_samples, n_features)) – Input samples.
- y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
- **fit_params (dict) – Additional fit parameters.
Returns: X_new – Transformed array.
Return type: ndarray of shape (n_samples, n_features_new)
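The contract described for fit_transform amounts to calling fit and then transform on the same data. The following is a minimal stdlib-only sketch of that contract; MeanCenterer is a hypothetical class written for illustration and is not part of toad or scikit-learn.

```python
class MeanCenterer:
    """Hypothetical transformer, for illustration only (not part of toad)."""

    def fit(self, X, y=None):
        # Learn the per-column mean from a list of rows.
        n_rows = len(X)
        self.means_ = [sum(col) / n_rows for col in zip(*X)]
        return self

    def transform(self, X):
        # Subtract the learned column means from every row.
        return [[v - m for v, m in zip(row, self.means_)] for row in X]

    def fit_transform(self, X, y=None, **fit_params):
        # Same contract as documented above: fit, then transform the same X.
        return self.fit(X, y, **fit_params).transform(X)


X = [[1.0, 10.0], [3.0, 30.0]]
centered = MeanCenterer().fit_transform(X)
```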
-
load(rules, update=False, **kwargs)
Load rules from a dict or a JSON file.
Parameters:
- rules (dict) – dictionary of rules
- from_json (str|IOBase) – JSON file of rules
- update (bool) – whether to update the existing rules instead of replacing them
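The export and load descriptions above suggest a round trip: rules go out as a dict (optionally written as JSON) and come back in, either replacing or updating what is already stored. The sketch below mimics that behaviour with a hypothetical RuleStore class; it is not toad code, and it only accepts file objects, not path strings.

```python
import io
import json


class RuleStore:
    """Hypothetical stand-in for the rules behaviour described above (not toad code)."""

    def __init__(self):
        self.rules = {}

    def export(self, to_json=None):
        # Return the rules as a dict; also write JSON if a file object is given.
        if to_json is not None:
            json.dump(self.rules, to_json)
        return dict(self.rules)

    def load(self, rules=None, update=False, from_json=None):
        # Read the rules from a JSON file object when one is supplied.
        if from_json is not None:
            rules = json.load(from_json)
        if update:
            self.rules.update(rules)  # merge into the existing rules
        else:
            self.rules = dict(rules)  # replace the existing rules
        return self


store = RuleStore().load({"age": [18, 30, 45]})
store.load({"income": [1000, 5000]}, update=True)

buffer = io.StringIO()
exported = store.export(to_json=buffer)
buffer.seek(0)
restored = RuleStore().load(from_json=buffer)
```

With update=True the second call keeps the "age" entry; without it, the second dict would replace the first.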
-
rules
-
-
class toad.transform.WOETransformer
Bases: toad.transform.Transformer
WOE (weight of evidence) transformer.
-
fit_(X, y)
Fit the WOE transformer.
Parameters:
- X (DataFrame|array-like) –
- y (str|array-like) –
- select_dtypes (str|numpy.dtypes) – 'object', 'number', etc.; only the selected dtypes will be transformed
-
transform_(rule, X, default='min')
Transform function for a single feature.
Parameters:
- X (array-like) –
- default (str) – 'min' (default) or 'max'; the strategy used for unknown groups
Returns: array-like
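To make the single-feature transform and its default option concrete, here is a stdlib-only sketch of WOE encoding. It is not toad's implementation. The sign convention (log of the share of positives over the share of negatives), the helper names fit_woe and transform_woe, and the reading of 'min'/'max' as "fall back to the smallest or largest fitted WOE" are all assumptions of this sketch. It also assumes every group contains at least one positive and one negative sample.

```python
import math
from collections import Counter


def fit_woe(values, target):
    """Return {group: WOE} for one categorical feature (sketch, not toad code).

    Assumed convention: WOE = ln(share of positives / share of negatives).
    """
    pos = Counter(v for v, t in zip(values, target) if t == 1)
    neg = Counter(v for v, t in zip(values, target) if t == 0)
    pos_total, neg_total = sum(pos.values()), sum(neg.values())
    return {
        g: math.log((pos[g] / pos_total) / (neg[g] / neg_total))
        for g in set(values)
    }


def transform_woe(rule, values, default="min"):
    # Unknown groups fall back to the smallest or largest fitted WOE (assumed).
    fallback = min(rule.values()) if default == "min" else max(rule.values())
    return [rule.get(v, fallback) for v in values]


values = ["a", "a", "a", "b", "b", "b"]
target = [1, 0, 0, 1, 1, 0]
rule = fit_woe(values, target)
encoded = transform_woe(rule, ["a", "b", "c"], default="min")
```

Here "c" was never seen during fitting, so it receives the fallback value chosen by default.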
-
export(**kwargs)
Export rules to a dict or a JSON file.
Parameters: to_json (str|IOBase) – JSON file to save the rules to
Returns: dictionary of rules
Return type: dict
-
fit()
Fit method; see the fit_ method for details.
-
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits the transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters:
- X (array-like of shape (n_samples, n_features)) – Input samples.
- y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
- **fit_params (dict) – Additional fit parameters.
Returns: X_new – Transformed array.
Return type: ndarray of shape (n_samples, n_features_new)
-
load(rules, update=False, **kwargs)
Load rules from a dict or a JSON file.
Parameters:
- rules (dict) – dictionary of rules
- from_json (str|IOBase) – JSON file of rules
- update (bool) – whether to update the existing rules instead of replacing them
-
rules
-
-
class toad.transform.Combiner
Bases: toad.transform.Transformer, toad.utils.mixin.BinsMixin
Combiner for merging data.
-
fit_(X, y=None, method='chi', empty_separate=False, **kwargs)
Fit the combiner.
Parameters:
- X (DataFrame|array-like) – features to be combined
- y (str|array-like) – target data or name of the target in X
- method (str) – the strategy used to merge X, same as in .merge; default is 'chi'
- n_bins (int) – number of bins to combine into
- empty_separate (bool) – whether to put empty values into a separate group
-
transform_(rule, X, labels=False, ellipsis=16, **kwargs)
Transform X with the combiner.
Parameters:
- X (DataFrame|array-like) – features to be transformed
- labels (bool) – whether to use labels for the resulting bins; False by default
- ellipsis (int) – maximum label length before a label is shortened with an ellipsis; None to skip shortening
Returns: array-like
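For numeric features, transforming by a combiner comes down to mapping each value to the bin that its split points define. The sketch below shows one way to do this with the standard library. It assumes left-closed bins of the form [lo, hi), which matches the label shape accepted by NUMBER_EXP. Mapping missing values to -1 is an assumption of this sketch, not a statement about how toad uses EMPTY_BIN, and apply_splits is a hypothetical helper.

```python
import math
from bisect import bisect_right

EMPTY_BIN = -1  # assumption of this sketch: missing values map to this index


def apply_splits(values, splits):
    """Map each value to the index of its left-closed bin [lo, hi)."""
    out = []
    for v in values:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            out.append(EMPTY_BIN)
        else:
            # The number of split points <= v is the bin index.
            out.append(bisect_right(splits, v))
    return out


splits = [10, 20]
bins = apply_splits([5, 10, 19.9, 20, float("nan")], splits)
```

A value equal to a split point lands in the bin that starts at that point, so 10 goes to bin 1 and 20 goes to bin 2.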
-
set_rules(map, reset=False)
Set rules for the combiner.
Parameters:
- map (dict|array-like) – map of splits
- reset (bool) – whether to reset the combiner
Returns: self
-
ELSE_GROUP = 'else'
-
EMPTY_BIN = -1
-
NUMBER_EXP = re.compile('\\[(-inf|-?\\d+(.\\d+)?)\\s*[~-]\\s*(inf|-?\\d+(.\\d+)?)\\)')
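The NUMBER_EXP pattern matches interval labels such as "[10 ~ 20)" and captures the lower and upper bounds in groups 1 and 3. The sketch below copies the pattern as printed and uses it to parse labels back into numbers. parse_interval is a hypothetical helper, and the choice of search over match is an assumption made so that an index prefix is tolerated.

```python
import re

# Pattern copied from the NUMBER_EXP attribute above, written as a raw string.
NUMBER_EXP = re.compile(r'\[(-inf|-?\d+(.\d+)?)\s*[~-]\s*(inf|-?\d+(.\d+)?)\)')


def parse_interval(label):
    """Return (lower, upper) as floats, or None if the label does not match."""
    m = NUMBER_EXP.search(label)  # search, not match: a choice of this sketch
    if m is None:
        return None
    # Groups 1 and 3 hold the bounds; float() understands "-inf" and "inf".
    return float(m.group(1)), float(m.group(3))


unbounded = parse_interval("[-inf ~ 3.5)")
dashed = parse_interval("[10 - 20)")
prefixed = parse_interval("01.[10 ~ 20)")
no_match = parse_interval("else")
```

Note that the `.` inside the pattern is unescaped, so it matches any character and not only a decimal point. A label such as "[1x5 ~ 2)" would therefore match the pattern but fail in float().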
-
export(**kwargs)
Export rules to a dict or a JSON file.
Parameters: to_json (str|IOBase) – JSON file to save the rules to
Returns: dictionary of rules
Return type: dict
-
fit()
Fit method; see the fit_ method for details.
-
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits the transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters:
- X (array-like of shape (n_samples, n_features)) – Input samples.
- y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
- **fit_params (dict) – Additional fit parameters.
Returns: X_new – Transformed array.
Return type: ndarray of shape (n_samples, n_features_new)
-
classmethod format_bins(bins, index=False, ellipsis=None)
Format bins into labels.
Parameters:
- bins (ndarray) – bins to format
- index (bool) – whether to add an index prefix
- ellipsis (int) – maximum label length before a label is shortened with an ellipsis; None to skip shortening
Returns: array of labels
Return type: ndarray
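As a rough illustration of what formatting bins into labels can look like, the sketch below builds interval labels from sorted split points. It is not toad's implementation. The helper name format_labels, the "[lo ~ hi)" shape (chosen to agree with NUMBER_EXP), the two-digit index prefix, and the ".." truncation marker are all assumptions of this sketch.

```python
def format_labels(splits, index=False, ellipsis=None):
    """Build interval labels from sorted split points (sketch, not toad code)."""
    edges = ["-inf", *map(str, splits), "inf"]
    labels = [f"[{lo} ~ {hi})" for lo, hi in zip(edges, edges[1:])]
    if ellipsis is not None:
        # Shorten labels longer than the threshold and mark the cut.
        labels = [
            lab if len(lab) <= ellipsis else lab[:ellipsis] + ".."
            for lab in labels
        ]
    if index:
        # A zero-padded position prefix keeps labels sortable as strings.
        labels = [f"{i:02}.{lab}" for i, lab in enumerate(labels)]
    return labels


plain = format_labels([10, 20])
indexed = format_labels([10], index=True)
shortened = format_labels([10], ellipsis=5)
```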
-
load(rules, update=False, **kwargs)
Load rules from a dict or a JSON file.
Parameters:
- rules (dict) – dictionary of rules
- from_json (str|IOBase) – JSON file of rules
- update (bool) – whether to update the existing rules instead of replacing them
-
rules
-
-
class toad.transform.GBDTTransformer
Bases: toad.transform.Transformer
GBDT transformer.
-
fit_(X, y, **kwargs)
Fit the GBDT transformer.
Parameters:
- X (DataFrame|array-like) –
- y (str|array-like) –
- select_dtypes (str|numpy.dtypes) – 'object', 'number', etc.; only the selected dtypes will be transformed
-
transform_(rules, X)
Transform X with the GBDT transformer.
Parameters: X (DataFrame|array-like) –
Returns: array-like
-
export(**kwargs)
Export rules to a dict or a JSON file.
Parameters: to_json (str|IOBase) – JSON file to save the rules to
Returns: dictionary of rules
Return type: dict
-
fit()
Fit method; see the fit_ method for details.
-
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits the transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters:
- X (array-like of shape (n_samples, n_features)) – Input samples.
- y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
- **fit_params (dict) – Additional fit parameters.
Returns: X_new – Transformed array.
Return type: ndarray of shape (n_samples, n_features_new)
-
load(rules, update=False, **kwargs)
Load rules from a dict or a JSON file.
Parameters:
- rules (dict) – dictionary of rules
- from_json (str|IOBase) – JSON file of rules
- update (bool) – whether to update the existing rules instead of replacing them
-
rules
-