toad.stats module
- toad.stats.gini(target)
get gini index of a feature
- Parameters
target (array-like) – target values for which the Gini index will be calculated
- Returns
gini value
- Return type
number
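As an illustration of the quantity described above, the following is a minimal sketch of the textbook Gini impurity, 1 - sum(p_k^2) over the class proportions p_k. The name `gini_impurity` is hypothetical and the code is not taken from toad; it only assumes that the Gini index here means this standard definition.

```python
from collections import Counter


def gini_impurity(target):
    """Textbook Gini impurity: 1 - sum(p_k ** 2) over class proportions."""
    n = len(target)
    if n == 0:
        return 0.0  # assumption: an empty target is treated as pure
    counts = Counter(target)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


balanced = gini_impurity([0, 1, 0, 1])  # two equally frequent classes
pure = gini_impurity([1, 1, 1, 1])      # a single class
```

Under this definition a single-class target scores 0, and a balanced binary target scores 0.5, the maximum for two classes.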
- toad.stats.gini_cond(feature, target)
get conditional gini index of a feature
- Parameters
feature (array-like) –
target (array-like) –
- Returns
conditional Gini value. If the feature is continuous, the best Gini value obtained by binning the feature into two groups is returned
- Return type
number
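The continuous case can be pictured as a search over two-group splits. The sketch below tries every threshold between distinct sorted feature values and keeps the lowest weighted Gini impurity. It assumes that "best" means lowest weighted impurity; `best_binary_split_gini` and `_gini` are hypothetical names, and toad's own split search may differ.

```python
from collections import Counter


def _gini(values):
    """Gini impurity of a list of class labels."""
    n = len(values)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(values).values())


def best_binary_split_gini(feature, target):
    """Lowest weighted Gini over all splits of the sorted feature into two groups."""
    pairs = sorted(zip(feature, target))
    n = len(pairs)
    # Fallback when no split is possible (all feature values equal).
    best = _gini([t for _, t in pairs])
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a threshold between equal feature values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        weighted = (len(left) * _gini(left) + len(right) * _gini(right)) / n
        best = min(best, weighted)
    return best


score = best_binary_split_gini([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1])
```

In this toy input, splitting between 2.0 and 3.0 separates the classes perfectly, so the weighted impurity drops to 0.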
- toad.stats.entropy(target)
get information entropy of a feature
- Parameters
target (array-like) –
- Returns
information entropy
- Return type
number
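For reference, a minimal sketch of Shannon entropy, -sum(p_k * log(p_k)) over class proportions. The natural logarithm is an assumption here, since the text above does not state which base is used, and `shannon_entropy` is a hypothetical name rather than toad's implementation.

```python
import math
from collections import Counter


def shannon_entropy(target):
    """Shannon entropy in nats: -sum(p_k * ln(p_k)) over class proportions."""
    n = len(target)
    if n == 0:
        return 0.0  # assumption: an empty target carries no information
    return -sum((c / n) * math.log(c / n) for c in Counter(target).values())


balanced = shannon_entropy([0, 1, 0, 1])  # two equally frequent classes
pure = shannon_entropy([1, 1, 1, 1])      # a single class
```

With the natural logarithm, a balanced binary target gives ln 2, and a single-class target gives 0.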
- toad.stats.entropy_cond(feature, target)
get conditional entropy of a feature
- Parameters
feature (array-like) –
target (array-like) –
- Returns
conditional information entropy. If the feature is continuous, the best entropy obtained by binning the feature into two groups is returned
- Return type
number
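For a categorical feature, conditional entropy is conventionally the entropy of the target within each feature value, weighted by how often that value occurs. The sketch below shows this standard definition only; it uses the natural logarithm by assumption, does not cover the continuous case, and `conditional_entropy` is a hypothetical name.

```python
import math
from collections import Counter, defaultdict


def _entropy(values):
    """Shannon entropy (nats) of a list of class labels."""
    n = len(values)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())


def conditional_entropy(feature, target):
    """H(target | feature) = sum over x of p(x) * H(target | feature == x)."""
    groups = defaultdict(list)
    for x, y in zip(feature, target):
        groups[x].append(y)
    n = len(target)
    if n == 0:
        return 0.0
    return sum(len(ys) / n * _entropy(ys) for ys in groups.values())


# The feature fully determines the target: nothing left to explain.
informative = conditional_entropy(["a", "a", "b", "b"], [0, 0, 1, 1])
# The feature says nothing about the target: entropy is unchanged.
uninformative = conditional_entropy(["a", "a", "b", "b"], [0, 1, 0, 1])
```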
- toad.stats.WOE(y_prob, n_prob)
get WOE of a group
- Parameters
y_prob – the proportion of all y samples that fall in the group
n_prob – the proportion of all n samples that fall in the group
- Returns
woe value
- Return type
number
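Weight of evidence is commonly defined as the natural log of the ratio of the two proportions. The sketch below assumes that convention, ln(y_prob / n_prob); the function name `woe` is hypothetical, and the sign convention in toad should be checked against its source.

```python
import math


def woe(y_prob, n_prob):
    """Weight of evidence of one group, assumed to be ln(y_prob / n_prob)."""
    return math.log(y_prob / n_prob)


# A group holding 30% of all y samples and 10% of all n samples.
value = woe(0.30, 0.10)
# A group holding equal shares of both carries no evidence either way.
neutral = woe(0.20, 0.20)
```

Under this convention a positive value means the group is over-represented among y samples, and a value of 0 means the group is equally represented in both.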
- toad.stats.IV(feature, target, return_sub=False, **kwargs)
get the IV of a feature
- Parameters
feature (array-like) –
target (array-like) –
return_sub (bool) – whether to return the IV of each group
n_bins (int) – number of groups the feature will be binned into
method (str) – the strategy used to merge the feature, default is ‘dt’
**kwargs – other options for the merge function
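Information value is usually the sum, over groups, of (y_prob - n_prob) * ln(y_prob / n_prob). The sketch below applies that formula to a feature that has already been binned, so it does not reproduce the merge step controlled by `n_bins` and `method`. The name `information_value`, the 1/0 target encoding, and the handling of empty classes are assumptions made for illustration.

```python
import math
from collections import defaultdict


def information_value(bins, target):
    """IV of an already-binned feature with a binary target (1 = y, 0 = n)."""
    y_total = sum(1 for t in target if t == 1)
    n_total = len(target) - y_total
    if y_total == 0 or n_total == 0:
        raise ValueError("both classes must be present in target")

    counts = defaultdict(lambda: [0, 0])  # bin label -> [y count, n count]
    for b, t in zip(bins, target):
        counts[b][0 if t == 1 else 1] += 1

    iv = 0.0
    for y_count, n_count in counts.values():
        y_prob = y_count / y_total
        n_prob = n_count / n_total
        if y_prob == 0 or n_prob == 0:
            continue  # assumption: skip groups missing a class to avoid log(0)
        iv += (y_prob - n_prob) * math.log(y_prob / n_prob)
    return iv


bins = ["low"] * 4 + ["high"] * 4
target = [1, 0, 0, 0, 1, 1, 1, 0]
iv = information_value(bins, target)
```

Each term is non-negative, because (y_prob - n_prob) and ln(y_prob / n_prob) always share the same sign, so groups can only add to the total.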
- toad.stats.badrate(target)
calculate badrate
- Parameters
target (array-like) – target array in which 1 marks a bad sample
- Returns
float
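Read literally, the bad rate is the share of samples whose target equals 1. A minimal sketch under that reading follows; `bad_rate` is a hypothetical name, not toad's code.

```python
def bad_rate(target):
    """Fraction of samples labeled 1 (bad) in the target."""
    target = list(target)
    return sum(1 for t in target if t == 1) / len(target)


rate = bad_rate([1, 0, 0, 1, 0])  # 2 bad samples out of 5
```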
- class toad.stats.indicator(*args, is_class=False, **kwargs)
Bases: Decorator
indicator decorator
- toad.stats.column_quality(feature, target, name='feature', indicators=[], need_merge=False, **kwargs)
calculate quality of a feature
- Parameters
feature (array-like) –
target (array-like) –
name (str) – the feature’s name, which will be set on the returned Series
indicators (list) – list of indicator functions
need_merge (bool) – whether the feature needs to be merged
- Returns
the quality indicators of the feature, labeled with the feature’s name
- Return type
Series
- toad.stats.quality(dataframe, target='target', cpu_cores=0, iv_only=False, indicators=['iv', 'gini', 'entropy', 'unique'], **kwargs)
get quality of features in data
- Parameters
dataframe (DataFrame) – dataframe whose feature quality will be calculated
target (str) – the target’s name in dataframe
iv_only (bool) – deprecated. Whether to calculate only IV
indicators (list) – indicators to calculate; custom indicator functions are also accepted. Default is [‘iv’, ‘gini’, ‘entropy’, ‘unique’]
cpu_cores (int) – the maximum number of CPU cores to use; 0 means all CPUs will be used, -1 means all CPUs but one will be used.
- Returns
quality of features with the features’ name as row name
- Return type
DataFrame