toad.transform module

class toad.transform.Transformer[source]

Bases: TransformerMixin, RulesMixin

Base class for transformers

fit(X, *args, update=False, **kwargs)[source]

fit method, see details in fit_ method

transform(X, *args, **kwargs)[source]

transform method, see details in transform_ method
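The split between the public fit/transform methods and the fit_/transform_ hooks can be pictured with a small stand-alone sketch. It assumes the public methods loop over features and delegate each one to the hook, storing one rule per feature; the class names SketchTransformer and MinMaxSketch, and the use of a plain dict of columns instead of a DataFrame, are hypothetical choices for illustration and not toad's implementation.

```python
class SketchTransformer:
    """Hypothetical stand-in for a per-feature transformer base class."""

    def __init__(self):
        self.rules = {}  # feature name -> rule returned by fit_

    def fit(self, X, *args, update=False, **kwargs):
        # X is a plain dict of column name -> list of values (an assumption
        # made to keep the sketch free of third-party imports).
        if not update:
            self.rules = {}
        for name, column in X.items():
            self.rules[name] = self.fit_(column, *args, **kwargs)
        return self

    def transform(self, X, *args, **kwargs):
        # Columns that have no rule are passed through unchanged.
        return {
            name: self.transform_(self.rules[name], column, *args, **kwargs)
            if name in self.rules else list(column)
            for name, column in X.items()
        }

    def fit_(self, column):
        raise NotImplementedError

    def transform_(self, rule, column):
        raise NotImplementedError


class MinMaxSketch(SketchTransformer):
    """Toy subclass: the rule is the observed min and max of a column."""

    def fit_(self, column):
        return {"min": min(column), "max": max(column)}

    def transform_(self, rule, column):
        span = (rule["max"] - rule["min"]) or 1
        return [(v - rule["min"]) / span for v in column]


scaled = MinMaxSketch().fit({"age": [20, 30, 40]}).transform(
    {"age": [20, 40], "id": [7, 8]}
)
```

A subclass only has to describe how one feature is fitted and transformed; the looping and rule bookkeeping stay in the base class.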

export(**kwargs)[source]

export rules to dict or a json file

Parameters

to_json (str|IOBase) – json file to save rules

Returns

dictionary of rules

Return type

dict
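As a rough illustration of the export contract described above (return a dict, optionally write it to a JSON file), the following sketch uses only the standard library. The function name export_rules and the example rule values are hypothetical; the sketch assumes to_json may be either a path or an open file object, as the str|IOBase type suggests.

```python
import json
import os
import tempfile


def export_rules(rules, to_json=None):
    """Return rules as a dict and optionally write them to a JSON file."""
    exported = dict(rules)
    if to_json is not None:
        if hasattr(to_json, "write"):   # already an open file object
            json.dump(exported, to_json)
        else:                           # otherwise treat it as a file path
            with open(to_json, "w") as fh:
                json.dump(exported, fh)
    return exported


rules = {"age": [20, 30, 40]}           # made-up rules for the example
path = os.path.join(tempfile.mkdtemp(), "rules.json")
exported = export_rules(rules, to_json=path)

# Reading the file back gives the same dictionary.
with open(path) as fh:
    restored = json.load(fh)
```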

fit_transform(X, y=None, **fit_params)[source]

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).

  • **fit_params (dict) – Additional fit parameters.

Returns

X_new – Transformed array.

Return type

ndarray of shape (n_samples, n_features_new)

load(rules, update=False, **kwargs)[source]

load rules from dict or json file

Parameters
  • rules (dict) – dictionary of rules

  • from_json (str|IOBase) – json file of rules

  • update (bool) – whether to update the existing rules instead of replacing them
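The difference between replacing and updating rules can be sketched as follows. This assumes that updating behaves like a dictionary merge keyed by feature name; the helper load_rules and the sample rules are hypothetical and only illustrate the update flag.

```python
def load_rules(current, rules, update=False):
    """Return the rule set that results from loading `rules` into `current`."""
    if update:
        merged = dict(current)
        merged.update(rules)   # incoming entries win, other features survive
        return merged
    return dict(rules)         # replace everything that was there before


current = {"age": [20, 30], "income": [1000]}
incoming = {"age": [25, 35]}

replaced = load_rules(current, incoming)
updated = load_rules(current, incoming, update=True)
```

With update=False the income rule is dropped; with update=True it is kept and only the age rule changes.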

set_output(*, transform=None)[source]

Set output container.

See the scikit-learn example "Introducing the set_output API" for an example on how to use the API.

Parameters

transform ({"default", "pandas", "polars"}, default=None) –

Configure output of transform and fit_transform.

  • "default": Default output format of a transformer

  • "pandas": DataFrame output

  • "polars": Polars output

  • None: Transform configuration is unchanged

New in scikit-learn version 1.4: the "polars" option was added.

Returns

self – Estimator instance.

Return type

estimator instance

update(*args, **kwargs)[source]

update rules

Parameters
  • rules (dict) – dictionary of rules

  • from_json (str|IOBase) – json file of rules

class toad.transform.WOETransformer[source]

Bases: Transformer

WOE transformer

fit_(X, y)[source]

fit WOE transformer

Parameters
  • X (DataFrame|array-like) – features to fit

  • y (str|array-like) – target data or name of target in X

  • select_dtypes (str|numpy.dtypes) – ‘object’, ‘number’, etc.; only columns of the selected dtypes will be transformed

transform_(rule, X, default='min')[source]

transform function for single feature

Parameters
  • X (array-like) –

  • default (str) – ‘min’ (default) or ‘max’; the strategy used for unknown groups

Returns

array-like
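Weight of evidence is commonly defined per group as the log of the group's share of positives divided by its share of negatives. The sketch below assumes that definition (sign conventions differ between libraries) and assumes that the default argument picks the smallest or largest fitted WOE for values not seen during fitting. The functions fit_woe and transform_woe are hypothetical helpers, not toad's code, and they apply no smoothing, so every group needs at least one positive and one negative sample.

```python
import math
from collections import defaultdict


def fit_woe(values, target):
    """Map each distinct value to ln(P(value | y=1) / P(value | y=0))."""
    pos, neg = defaultdict(int), defaultdict(int)
    for v, t in zip(values, target):
        (pos if t == 1 else neg)[v] += 1
    total_pos, total_neg = sum(pos.values()), sum(neg.values())
    return {
        v: math.log((pos[v] / total_pos) / (neg[v] / total_neg))
        for v in set(values)
    }


def transform_woe(rule, values, default="min"):
    """Replace values by their WOE; unknown values get the min or max WOE."""
    fallback = min(rule.values()) if default == "min" else max(rule.values())
    return [rule.get(v, fallback) for v in values]


values = ["a", "a", "a", "a", "b", "b", "b", "b"]
target = [1, 0, 0, 0, 1, 1, 1, 0]

rule = fit_woe(values, target)                      # a -> ln(1/3), b -> ln(3)
low = transform_woe(rule, ["a", "c"])               # "c" is unseen, uses min
high = transform_woe(rule, ["a", "c"], default="max")
```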

export(**kwargs)[source]

export rules to dict or a json file

Parameters

to_json (str|IOBase) – json file to save rules

Returns

dictionary of rules

Return type

dict

fit(X, *args, update=False, **kwargs)[source]

fit method, see details in fit_ method

fit_transform(X, y=None, **fit_params)[source]

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).

  • **fit_params (dict) – Additional fit parameters.

Returns

X_new – Transformed array.

Return type

ndarray of shape (n_samples, n_features_new)

load(rules, update=False, **kwargs)[source]

load rules from dict or json file

Parameters
  • rules (dict) – dictionary of rules

  • from_json (str|IOBase) – json file of rules

  • update (bool) – whether to update the existing rules instead of replacing them

set_output(*, transform=None)[source]

Set output container.

See the scikit-learn example "Introducing the set_output API" for an example on how to use the API.

Parameters

transform ({"default", "pandas", "polars"}, default=None) –

Configure output of transform and fit_transform.

  • "default": Default output format of a transformer

  • "pandas": DataFrame output

  • "polars": Polars output

  • None: Transform configuration is unchanged

New in scikit-learn version 1.4: the "polars" option was added.

Returns

self – Estimator instance.

Return type

estimator instance

transform(X, *args, **kwargs)[source]

transform method, see details in transform_ method

update(*args, **kwargs)[source]

update rules

Parameters
  • rules (dict) – dictionary of rules

  • from_json (str|IOBase) – json file of rules

class toad.transform.Combiner[source]

Bases: Transformer, BinsMixin

Combiner for merging data into bins

fit_(X, y=None, method='chi', empty_separate=False, **kwargs)[source]

fit combiner

Parameters
  • X (DataFrame|array-like) – features to be combined

  • y (str|array-like) – target data or name of target in X

  • method (str) – the strategy used to merge X, same as in merge; default is chi

  • n_bins (int) – number of bins to merge the values into

  • empty_separate (bool) – whether to put empty values into a separate group

transform_(rule, X, labels=False, ellipsis=16, **kwargs)[source]

transform X by combiner

Parameters
  • X (DataFrame|array-like) – features to be transformed

  • labels (bool) – whether to use labels for the resulting bins; False by default

  • ellipsis (int) – maximum label length before a label is shortened with an ellipsis; None disables shortening

Returns

array-like
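For a numeric feature, transforming by a fitted combiner amounts to replacing each value with the index of the bin it falls into. A minimal sketch, assuming the rule is a sorted list of split points and that bins are closed on the left and open on the right; the helper bin_indices and the split values are made up for illustration.

```python
from bisect import bisect_right


def bin_indices(splits, values):
    """Return, for each value, the index of the bin defined by sorted splits."""
    # bisect_right counts the splits that are <= value, so a value equal to
    # a split point lands in the bin that starts at that split.
    return [bisect_right(splits, v) for v in values]


splits = [20, 30, 40]   # bins: (-inf, 20), [20, 30), [30, 40), [40, inf)
bins = bin_indices(splits, [5, 20, 29.9, 40, 100])
```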

set_rules(map, reset=False)[source]

set rules for combiner

Parameters
  • map (dict|array-like) – map of splits

  • reset (bool) – whether to reset the combiner

Returns

self
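A map of splits can be thought of as one rule per feature. The sketch below assumes two rule shapes: a sorted list of split points for numeric features and a list of category groups for categorical features. Both shapes, the helper apply_rule, and the feature names are assumptions for illustration; the source only says "map of splits".

```python
from bisect import bisect_right


def apply_rule(rule, value):
    """Return the bin index of value under a numeric or categorical rule."""
    if rule and isinstance(rule[0], (list, tuple)):
        # Categorical rule: each inner list is one group of categories.
        for index, group in enumerate(rule):
            if value in group:
                return index
        return -1   # unseen category; this sentinel is a choice of the sketch
    # Numeric rule: sorted split points, left-closed bins.
    return bisect_right(rule, value)


rule_map = {
    "age": [20, 30, 40],                # numeric split points
    "city": [["NY", "LA"], ["SF"]],     # categorical groups
}
row = {"age": 33, "city": "SF"}
binned = {name: apply_rule(rule_map[name], value) for name, value in row.items()}
```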

export(**kwargs)[source]

export rules to dict or a json file

Parameters

to_json (str|IOBase) – json file to save rules

Returns

dictionary of rules

Return type

dict

fit(X, *args, update=False, **kwargs)[source]

fit method, see details in fit_ method

fit_transform(X, y=None, **fit_params)[source]

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).

  • **fit_params (dict) – Additional fit parameters.

Returns

X_new – Transformed array.

Return type

ndarray of shape (n_samples, n_features_new)

classmethod format_bins(bins, index=False, ellipsis=None)[source]

format bins to label

Parameters
  • bins (ndarray) – bins to format

  • index (bool) – whether to prefix each label with its bin index

  • ellipsis (int) – maximum label length before a label is shortened with an ellipsis; None disables shortening

Returns

array of labels

Return type

ndarray
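To make the index and ellipsis options concrete, here is a small sketch that turns numeric split points into interval labels. The label layout ("[low ~ high)" with a two-digit index prefix), the ".." suffix, and the helper name format_numeric_bins are assumptions chosen for illustration and may differ from what format_bins actually produces.

```python
def format_numeric_bins(splits, index=False, ellipsis=None):
    """Build one interval label per bin from sorted split points."""
    edges = ["-inf"] + [str(s) for s in splits] + ["inf"]
    labels = [f"[{lo} ~ {hi})" for lo, hi in zip(edges, edges[1:])]
    if ellipsis is not None:
        # Shorten labels that are longer than the threshold.
        labels = [
            label if len(label) <= ellipsis else label[:ellipsis] + ".."
            for label in labels
        ]
    if index:
        # Prefix each label with its zero-padded bin index.
        labels = [f"{i:02}.{label}" for i, label in enumerate(labels)]
    return labels


plain = format_numeric_bins([20, 30])
indexed = format_numeric_bins([20, 30], index=True)
short = format_numeric_bins([20], ellipsis=5)
```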

load(rules, update=False, **kwargs)[source]

load rules from dict or json file

Parameters
  • rules (dict) – dictionary of rules

  • from_json (str|IOBase) – json file of rules

  • update (bool) – whether to update the existing rules instead of replacing them

classmethod parse_bins(bins)[source]

parse labeled bins to array
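Parsing is the inverse of formatting: interval labels go in, split points come out. The sketch below assumes labels of the form "[low ~ high)" with an optional "NN." index prefix; that layout, the regular expression, and the helper parse_numeric_bins are illustrative assumptions rather than the format parse_bins is guaranteed to accept.

```python
import re

# Optional "NN." index prefix, then "[low ~ high)".
LABEL = re.compile(r"^(?:\d+\.)?\[(?P<lo>\S+) ~ (?P<hi>\S+)\)$")


def parse_numeric_bins(labels):
    """Recover the sorted split points from a list of interval labels."""
    splits = []
    for label in labels:
        match = LABEL.match(label)
        if match is None:
            raise ValueError(f"unrecognised bin label: {label!r}")
        hi = match.group("hi")
        if hi != "inf":             # the last bin has no upper split point
            splits.append(float(hi))
    return splits


parsed = parse_numeric_bins(["00.[-inf ~ 20)", "01.[20 ~ 30)", "02.[30 ~ inf)"])
```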

set_output(*, transform=None)[source]

Set output container.

See the scikit-learn example "Introducing the set_output API" for an example on how to use the API.

Parameters

transform ({"default", "pandas", "polars"}, default=None) –

Configure output of transform and fit_transform.

  • "default": Default output format of a transformer

  • "pandas": DataFrame output

  • "polars": Polars output

  • None: Transform configuration is unchanged

New in scikit-learn version 1.4: the "polars" option was added.

Returns

self – Estimator instance.

Return type

estimator instance

transform(X, *args, **kwargs)[source]

transform method, see details in transform_ method

update(*args, **kwargs)[source]

update rules

Parameters
  • rules (dict) – dictionary of rules

  • from_json (str|IOBase) – json file of rules

class toad.transform.GBDTTransformer[source]

Bases: Transformer

GBDT transformer

__init__()[source]

fit_(X, y, **kwargs)[source]

fit GBDT transformer

Parameters
  • X (DataFrame|array-like) –

  • y (str|array-like) –

  • select_dtypes (str|numpy.dtypes) – ‘object’, ‘number’, etc.; only columns of the selected dtypes will be transformed

transform_(rules, X)[source]

transform X by GBDT transformer

Parameters

X (DataFrame|array-like) –

Returns

array-like
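A common way to use gradient-boosted trees as a feature transformer is to record which leaf each sample reaches in every tree and one-hot encode those leaf indices. The sketch below illustrates only the encoding step, under the assumption that GBDTTransformer follows this general technique (the source does not say so). The helper one_hot_leaves and the leaf indices are made up; no model is trained here.

```python
def one_hot_leaves(leaf_matrix):
    """One-hot encode leaf indices; leaf_matrix[i][t] is sample i in tree t."""
    n_trees = len(leaf_matrix[0])
    # The distinct leaves seen per tree, in sorted order, define the columns.
    leaves_per_tree = [
        sorted({row[t] for row in leaf_matrix}) for t in range(n_trees)
    ]
    encoded = []
    for row in leaf_matrix:
        features = []
        for t, leaves in enumerate(leaves_per_tree):
            features.extend(1 if row[t] == leaf else 0 for leaf in leaves)
        encoded.append(features)
    return encoded


# Three samples, two trees: tree 0 has leaves {1, 2}, tree 1 has leaves {3, 4}.
leaves = [[1, 4], [2, 4], [1, 3]]
encoded = one_hot_leaves(leaves)
```

Each output row has exactly one 1 per tree, so the number of columns equals the total number of distinct leaves across trees.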

export(**kwargs)[source]

export rules to dict or a json file

Parameters

to_json (str|IOBase) – json file to save rules

Returns

dictionary of rules

Return type

dict

fit(X, *args, update=False, **kwargs)[source]

fit method, see details in fit_ method

fit_transform(X, y=None, **fit_params)[source]

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).

  • **fit_params (dict) – Additional fit parameters.

Returns

X_new – Transformed array.

Return type

ndarray of shape (n_samples, n_features_new)

load(rules, update=False, **kwargs)[source]

load rules from dict or json file

Parameters
  • rules (dict) – dictionary of rules

  • from_json (str|IOBase) – json file of rules

  • update (bool) – whether to update the existing rules instead of replacing them

set_output(*, transform=None)[source]

Set output container.

See the scikit-learn example "Introducing the set_output API" for an example on how to use the API.

Parameters

transform ({"default", "pandas", "polars"}, default=None) –

Configure output of transform and fit_transform.

  • "default": Default output format of a transformer

  • "pandas": DataFrame output

  • "polars": Polars output

  • None: Transform configuration is unchanged

New in scikit-learn version 1.4: the "polars" option was added.

Returns

self – Estimator instance.

Return type

estimator instance

transform(X, *args, **kwargs)[source]

transform method, see details in transform_ method

update(*args, **kwargs)[source]

update rules

Parameters
  • rules (dict) – dictionary of rules

  • from_json (str|IOBase) – json file of rules