toad.stats module
-
toad.stats.gini(target)
Get the Gini index of the given target values.
Parameters: target (array-like) – target values for which the Gini index is calculated
Returns: Gini value
Return type: number
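A minimal sketch of the calculation, assuming that the Gini index here means the Gini impurity, 1 minus the sum of squared class proportions in target. That definition is an assumption, since this page does not state the formula, and the helper name gini_impurity is hypothetical rather than part of toad.

```python
from collections import Counter


def gini_impurity(target):
    """Gini impurity of a sequence of class labels (assumed definition)."""
    n = len(target)
    if n == 0:
        raise ValueError("target must not be empty")
    # Proportion of each distinct label, squared and summed.
    counts = Counter(target)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


print(gini_impurity([0, 0, 1, 1]))  # evenly mixed labels
print(gini_impurity([1, 1, 1, 1]))  # a single label
```

Under this definition a pure array scores 0, and an evenly split binary array scores 0.5.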
-
toad.stats.gini_cond
Get the conditional Gini index of a feature.
Parameters: - feature (array-like) –
- target (array-like) –
Returns: conditional Gini value. If the feature is continuous, the best Gini value obtained when the feature is binned into two groups is returned
Return type: number
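The two cases described above can be sketched as follows. This is an illustration under stated assumptions, not toad's implementation: the conditional Gini is taken to be the size-weighted Gini impurity of target within each feature group, "best" is taken to mean lowest, and the two groups are formed by a threshold test feature <= t. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def gini_impurity(labels):
    """Gini impurity of a non-empty sequence of labels (assumed definition)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def conditional_gini(feature, target):
    """Size-weighted Gini impurity of target within each distinct feature value."""
    groups = defaultdict(list)
    for x, y in zip(feature, target):
        groups[x].append(y)
    n = len(target)
    return sum(len(ys) / n * gini_impurity(ys) for ys in groups.values())


def best_binary_split_gini(feature, target):
    """Lowest conditional Gini over all two-group splits `feature <= t`.

    Returns None when the feature has a single distinct value,
    because no split into two groups exists.
    """
    best = None
    for t in sorted(set(feature))[:-1]:
        side = [x <= t for x in feature]  # True/False marks the two groups
        score = conditional_gini(side, target)
        if best is None or score < best:
            best = score
    return best


print(conditional_gini(["a", "a", "b", "b"], [0, 1, 0, 1]))
print(best_binary_split_gini([1, 2, 3, 4], [0, 0, 1, 1]))
```

The exhaustive threshold scan is chosen for clarity; it re-groups the data for every candidate threshold and is not meant to be efficient.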
-
toad.stats.entropy(target)
Get the information entropy of the given target values.
Parameters: target (array-like) –
Returns: information entropy
Return type: number
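As a rough illustration, information entropy can be computed from the label proportions of target. The sketch below assumes the natural logarithm; the base used by the library is not stated on this page, so treat that choice, and the helper name information_entropy, as assumptions.

```python
import math
from collections import Counter


def information_entropy(target):
    """Shannon entropy of a sequence of labels, in nats (natural log assumed)."""
    n = len(target)
    if n == 0:
        raise ValueError("target must not be empty")
    # sum of p * log(1 / p) over the distinct labels
    return sum((c / n) * math.log(n / c) for c in Counter(target).values())


print(information_entropy([0, 0, 1, 1]))  # evenly mixed labels
print(information_entropy([1, 1, 1, 1]))  # a single label
```

With this definition a pure array has entropy 0, and an evenly split binary array has entropy ln 2.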
-
toad.stats.entropy_cond
Get the conditional entropy of a feature.
Parameters: - feature (array-like) –
- target (array-like) –
Returns: conditional information entropy. If the feature is continuous, the best entropy obtained when the feature is binned into two groups is returned
Return type: number
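For a categorical feature, the conditional entropy can be sketched as the size-weighted entropy of target within each feature group. The code below covers only that case and assumes the natural logarithm; the name conditional_entropy is hypothetical and the library may compute the value differently.

```python
import math
from collections import Counter, defaultdict


def information_entropy(labels):
    """Shannon entropy of a non-empty sequence of labels, in nats."""
    n = len(labels)
    return sum((c / n) * math.log(n / c) for c in Counter(labels).values())


def conditional_entropy(feature, target):
    """Entropy of target within each distinct feature value, weighted by group size."""
    groups = defaultdict(list)
    for x, y in zip(feature, target):
        groups[x].append(y)
    n = len(target)
    return sum(len(ys) / n * information_entropy(ys) for ys in groups.values())


# A feature that separates the labels perfectly leaves no uncertainty.
print(conditional_entropy(["a", "a", "b", "b"], [0, 0, 1, 1]))
# A feature unrelated to the labels leaves the entropy unchanged.
print(conditional_entropy(["a", "a", "b", "b"], [0, 1, 0, 1]))
```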
-
toad.stats.WOE(y_prob, n_prob)
Get the WOE (weight of evidence) of a group.
Parameters: - y_prob – the proportion of all y that falls in the group
- n_prob – the proportion of all n that falls in the group
Returns: WOE value
Return type: number
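A sketch of the usual weight-of-evidence formula, assuming WOE is the natural log of the ratio y_prob / n_prob. Both the formula and the helper name weight_of_evidence are assumptions, as this page does not spell out the calculation.

```python
import math


def weight_of_evidence(y_prob, n_prob):
    """ln(y_prob / n_prob); undefined when either proportion is zero."""
    if y_prob <= 0 or n_prob <= 0:
        raise ValueError("both proportions must be positive")
    return math.log(y_prob / n_prob)


# Hypothetical group holding 30 of 100 y samples and 10 of 100 n samples.
print(weight_of_evidence(30 / 100, 10 / 100))
```

A group that holds equal shares of y and n has a WOE of 0 under this formula; the sign shows which class is over-represented.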
-
toad.stats.IV
Get the IV (information value) of a feature.
Parameters: - feature (array-like) –
- target (array-like) –
- return_sub (bool) – whether to return the IV of each group
- n_bins (int) – number of groups the feature will be binned into
- method (str) – the strategy used to merge the feature, default is ‘dt’
- **kwargs – other options for the merge function
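To make the quantity concrete, here is a sketch of the standard information value sum, (y_prob - n_prob) * ln(y_prob / n_prob) over groups, for a feature that has already been binned. It does not reproduce the merge step controlled by n_bins and method, it skips groups where one class is absent (a simplification), and the name information_value is hypothetical.

```python
import math
from collections import Counter


def information_value(bins, target):
    """IV of an already-binned feature; target uses 1 for y and anything else for n.

    Requires at least one sample of each class.
    """
    y_total = sum(1 for t in target if t == 1)
    n_total = len(target) - y_total
    y_counts = Counter(b for b, t in zip(bins, target) if t == 1)
    n_counts = Counter(b for b, t in zip(bins, target) if t != 1)

    iv = 0.0
    for b in set(bins):
        y_prob = y_counts[b] / y_total
        n_prob = n_counts[b] / n_total
        if y_prob == 0 or n_prob == 0:
            continue  # the log is undefined here; skipped in this sketch
        iv += (y_prob - n_prob) * math.log(y_prob / n_prob)
    return iv


bins = ["a"] * 4 + ["b"] * 4
target = [1, 1, 1, 0, 1, 0, 0, 0]
print(information_value(bins, target))
```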
-
toad.stats.badrate(target)
Calculate the bad rate.
Parameters: target (array-like) – target array in which 1 marks a bad sample
Returns: float
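Read literally, the bad rate is the share of samples in target that equal 1. A minimal sketch under that assumption, with the hypothetical helper name bad_rate:

```python
def bad_rate(target):
    """Fraction of entries equal to 1, where 1 marks a bad sample."""
    n = len(target)
    if n == 0:
        raise ValueError("target must not be empty")
    return sum(1 for t in target if t == 1) / n


print(bad_rate([1, 0, 0, 1, 0]))  # 2 bad samples out of 5
```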
-
class toad.stats.indicator(*args, is_class=False, **kwargs)
Bases: toad.utils.decorator.Decorator
indicator decorator
-
toad.stats.column_quality(feature, target, name='feature', indicators=[], need_merge=False, **kwargs)
Calculate the quality of a feature.
Parameters: - feature (array-like) –
- target (array-like) –
- name (str) – the feature’s name, which is set on the returned Series
- indicators (list) – list of indicator functions
- need_merge (bool) – whether the feature needs to be merged
Returns: quality values labeled with the feature’s name
Return type: Series
-
toad.stats.quality(dataframe, target='target', cpu_cores=0, iv_only=False, indicators=['iv', 'gini', 'entropy', 'unique'], **kwargs)
Get the quality of the features in the data.
Parameters: - dataframe (DataFrame) – dataframe whose feature quality will be calculated
- target (str) – the name of the target column in the dataframe
- iv_only (bool) – deprecated. Whether to calculate only the IV
- cpu_cores (int) – the maximum number of CPU cores to use; 0 means all CPUs are used, -1 means all CPUs but one are used.
Returns: quality of the features, with the feature names as row labels
Return type: DataFrame
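The cpu_cores convention described above (0 for all CPUs, -1 for all but one) can be sketched as a small resolver. This is only an illustration: resolve_worker_count is a hypothetical helper, not part of toad, and its handling of negative values other than -1 and of requests above the available count is an assumption.

```python
import os


def resolve_worker_count(cpu_cores, available=None):
    """Translate a cpu_cores setting into a concrete number of workers."""
    if available is None:
        available = os.cpu_count() or 1  # cpu_count() may return None
    if cpu_cores == 0:
        return available  # 0 means use every CPU
    if cpu_cores < 0:
        # -1 means all but one; other negative values are extrapolated here
        return max(1, available + cpu_cores)
    return min(cpu_cores, available)  # never exceed what the machine has


for setting in (0, -1, 4):
    print(setting, resolve_worker_count(setting, available=8))
```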