toad.preprocessing.process module

class toad.preprocessing.process.Processing(data)

Bases: object

Example:

>>> (Processing(data)
...     .groupby('id')
...     .partitionby(TimePartition(
...         'base_time',
...         'filter_time',
...         ['30d', '60d', '180d', '365d', 'all']
...     ))
...     .apply({'A': ['max', 'min', 'mean']})
...     .apply({'B': ['max', 'min', 'mean']})
...     .apply({'C': 'nunique'})
...     .apply({'D': {
...         'f': len,
...         'name': 'normal_count',
...         'mask':  Mask('D').isin(['normal']),
...     }})
...     .apply({'id': 'count'})
...     .exec()
... )

groupby(name)

Group the data by the column name.

Parameters:	name (str) – column name in data

apply(f)

Apply functions to the data.

Parameters:	f (dict | function) – a config dict whose keys are column names and whose values are the functions to apply; each function receives that column's Series as its argument. If f is a function, it receives the whole DataFrame as its argument.
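
The two accepted forms of f can be illustrated with a small standalone sketch. This is not toad's implementation: `apply_sketch`, the `NAMED` lookup table, the dict-of-lists "table", and the `column_function` output naming are all assumptions made for illustration, written in plain Python so the dispatch logic is visible.

```python
from statistics import mean

# Hypothetical stand-in for a table: column name -> list of values.
table = {"A": [1, 4, 7], "B": [10, 20, 30]}

# String names resolved to callables; this mapping is an assumption.
NAMED = {"max": max, "min": min, "mean": mean}


def apply_sketch(table, f):
    """Apply a config dict column by column, or a callable to the whole table."""
    if callable(f):
        # A plain function receives the whole table.
        return f(table)
    result = {}
    for column, funcs in f.items():
        # Accept either a single function or a list of functions.
        if not isinstance(funcs, (list, tuple)):
            funcs = [funcs]
        for func in funcs:
            label = func if isinstance(func, str) else func.__name__
            fn = NAMED[func] if isinstance(func, str) else func
            # Each function receives only its own column.
            result[f"{column}_{label}"] = fn(table[column])
    return result


per_column = apply_sketch(table, {"A": ["max", "min", "mean"], "B": len})
whole_table = apply_sketch(table, lambda t: len(t["A"]))
```

Here `per_column` holds one entry per (column, function) pair, while `whole_table` is whatever the callable returns for the full table.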

append_func(col, func)

partitionby(p)

Partition the data into multiple pieces; the processing steps are then applied to every piece.

Parameters:	p (Partition) –
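
As a rough illustration of partitioning by time window, the sketch below splits rows into one subset per window label, in the spirit of the `TimePartition('base_time', 'filter_time', [...])` call in the example above. The function `time_partitions`, the row layout, and the rule "keep rows whose filter_time falls within the window before base_time" are assumptions, not toad's documented behavior.

```python
from datetime import date, timedelta

# Hypothetical rows; the column names mirror the example above.
rows = [
    {"id": 1, "base_time": date(2024, 1, 1), "filter_time": date(2023, 12, 20)},
    {"id": 1, "base_time": date(2024, 1, 1), "filter_time": date(2023, 11, 15)},
    {"id": 1, "base_time": date(2024, 1, 1), "filter_time": date(2022, 6, 1)},
]


def time_partitions(rows, windows):
    """Yield (label, subset) pairs, one per window label such as '30d' or 'all'."""
    for label in windows:
        if label == "all":
            yield label, list(rows)
            continue
        limit = timedelta(days=int(label.rstrip("d")))
        # Assumed rule: keep rows whose filter_time lies within `limit` before base_time.
        subset = [
            r for r in rows
            if timedelta(0) <= r["base_time"] - r["filter_time"] <= limit
        ]
        yield label, subset


# Every later processing step would run once per subset.
counts = {
    label: len(subset)
    for label, subset in time_partitions(rows, ["30d", "60d", "all"])
}
```

Under this assumed rule the windows are nested, so a row counted in the 30-day piece also appears in the 60-day and "all" pieces.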

exec()

process(data)

class toad.preprocessing.process.Mask(column=None)

Bases: object

A placeholder for selecting rows of a DataFrame.

push(op, value)

replay(data)

isin(other)

isna()
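
The method names push and replay suggest a record-then-replay design: conditions are stored when the mask is built and evaluated only once data is available. The sketch below shows that pattern in plain Python. `LazyMask` is a hypothetical class, not toad's Mask; operating on a list of dict rows and combining recorded conditions with logical AND are both assumptions.

```python
class LazyMask:
    """Hypothetical placeholder that records conditions and replays them later."""

    def __init__(self, column=None):
        self.column = column
        self.operations = []  # recorded (op_name, value) pairs

    def push(self, op, value):
        # Store the condition instead of evaluating it now.
        self.operations.append((op, value))
        return self  # returning self allows chaining

    def isin(self, other):
        return self.push("isin", set(other))

    def isna(self):
        return self.push("isna", None)

    def replay(self, rows):
        """Return one boolean per row; every recorded condition must hold."""
        flags = []
        for row in rows:
            value = row.get(self.column)
            ok = True
            for op, arg in self.operations:
                if op == "isin":
                    ok = ok and value in arg
                elif op == "isna":
                    ok = ok and value is None
            flags.append(ok)
        return flags


rows = [{"D": "normal"}, {"D": "overdue"}, {"D": None}]
selected = LazyMask("D").isin(["normal"]).replay(rows)
missing = LazyMask("D").isna().replay(rows)
```

Deferring evaluation this way lets a mask be declared inside a config dict, as in the `'mask': Mask('D').isin(['normal'])` entry of the example above, before any data has been bound to it.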

class toad.preprocessing.process.F(f, name=None, mask=None)

Bases: object

A function wrapper used for processing.

name

is_buildin

need_filter

filter(data)
