mlshell.pipeline
The mlshell.pipeline module contains pipeline-related utilities.
Classes
Steps – Unified pipeline steps.
class mlshell.pipeline.Steps(estimator, estimator_type=None, th_step=False)
Bases: object
Unified pipeline steps.
Parameters
estimator (sklearn estimator) – Estimator to use in the last step. It is wrapped according to estimator_type and th_step:
If estimator_type=regressor:
    sklearn.compose.TransformedTargetRegressor(regressor=estimator)
If estimator_type=classifier and th_step=True:
    sklearn.pipeline.Pipeline(steps=[
        ('predict_proba', mlshell.model_selection.PredictionTransformer(estimator)),
        ('apply_threshold', mlshell.model_selection.ThresholdClassifier(threshold=0.5, kwargs='auto')),
    ])
If estimator_type=classifier and th_step=False:
    sklearn.pipeline.Pipeline(steps=[('classifier', estimator)])
estimator_type (str {'classifier', 'regressor'}, optional (default=None)) – Either a regression or a classification task. If None, it is inferred by calling sklearn.base.is_classifier() on estimator.
th_step (bool) – If True and estimator_type=classifier, the mlshell.model_selection.ThresholdClassifier sub-step is added; otherwise the flag is ignored.
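The wrapping rules for the two cases that need only scikit-learn can be sketched as follows. This is a minimal illustration and not the mlshell implementation: the helper name sketch_last_step is hypothetical, and the th_step=True branch is omitted because it relies on mlshell classes.

```python
from sklearn.base import is_classifier
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.pipeline import Pipeline


def sketch_last_step(estimator, estimator_type=None):
    """Hypothetical helper mirroring the documented rules for th_step=False."""
    if estimator_type is None:
        # Documented fallback: infer the task from the estimator itself.
        estimator_type = "classifier" if is_classifier(estimator) else "regressor"
    if estimator_type == "regressor":
        return TransformedTargetRegressor(regressor=estimator)
    if estimator_type == "classifier":
        return Pipeline(steps=[("classifier", estimator)])
    raise ValueError(f"Unknown estimator_type: {estimator_type!r}")


reg_step = sketch_last_step(LinearRegression())
clf_step = sketch_last_step(LogisticRegression())
```

In both cases the returned object is itself an estimator, so it can serve as the final step of an outer pipeline.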
Notes
Steps are assembled in a class for convenience. Use the steps property to access them after initialization. Only the OneHot encoder and imputer steps are activated initially. By default, four parameters await resolution ('auto'):
'process_parallel__pipeline_categoric__select_columns__kw_args'
'process_parallel__pipeline_numeric__select_columns__kw_args'
'estimate__apply_threshold__threshold'
'estimate__apply_threshold__params'
Set the corresponding parameters with set_params() to overwrite the defaults in the created pipeline, or use mlshell.model_selection.Resolver.
The 'pass_custom' step allows brute-forcing arbitrary parameters in the same uniform style as pipeline hyper-parameters (as if the score contained additional nested loops). The step name is hard-coded and cannot be changed.
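The double-underscore parameter names follow the scikit-learn convention for addressing nested steps. The sketch below shows that convention on a small stand-in pipeline; the step names 'scale', 'estimate' and 'classifier' are chosen for illustration and are not the steps that mlshell builds.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative pipeline with one nested pipeline as its last step.
pipeline = Pipeline(steps=[
    ("scale", StandardScaler()),
    ("estimate", Pipeline(steps=[("classifier", LogisticRegression())])),
])

# '<step>__<sub-step>__<parameter>' addresses a parameter of a nested step.
pipeline.set_params(estimate__classifier__C=0.1, scale__with_mean=False)

# get_params() reports the same flattened names.
params = pipeline.get_params()
```

The same naming scheme is what a hyper-parameter grid would use as its keys.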
The 'apply_threshold' step allows grid-searching classification thresholds as a pipeline hyper-parameter.
The 'estimate' step should be the last one.
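The idea of treating the threshold as a tunable value can be sketched with plain NumPy. This is not mlshell's ThresholdClassifier: the helper name, the use of >= for the comparison, and the assumption that column 1 holds the positive-class probability are all illustrative.

```python
import numpy as np


def apply_threshold(proba, threshold=0.5, classes=(0, 1)):
    """Hypothetical helper: map positive-class probabilities to labels."""
    proba = np.asarray(proba)
    # Assumption: column 1 is the positive-class probability,
    # as in the output of a binary predict_proba.
    positive = proba[:, 1] >= threshold
    return np.where(positive, classes[1], classes[0])


proba = np.array([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]])
labels_default = apply_threshold(proba)
labels_lower = apply_threshold(proba, threshold=0.3)
```

Lowering the threshold from 0.5 to 0.3 moves the second sample into the positive class, which is the kind of trade-off a grid search over thresholds explores.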
Attributes
steps – list: access steps to pass to sklearn.pipeline.Pipeline.
Methods
bining_mask(x) – Get indices of features that need binning.
last_step(estimator, estimator_type, th_step) – Prepare the estimator step.
scorer_kwargs(x, **kw_args) – Mock function for setting custom kwargs.
subcolumns(x, **kw_args) – Get sub-columns from x.
subrows(x) – Get rows from x.
last_step(estimator, estimator_type, th_step)
Prepare the estimator step.
property steps
list: access steps to pass to sklearn.pipeline.Pipeline.
scorer_kwargs(x, **kw_args)
Mock function for setting custom kwargs.
Parameters
x (numpy.ndarray or pandas.DataFrame) – Features of shape [n_samples, n_features].
**kw_args (dict) – Step parameters. They can be extracted from the pipeline in a scorer if needed.
Returns
result – Unchanged x.
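A pass-through step of this kind can be sketched with scikit-learn's FunctionTransformer, whose kw_args parameter forwards keyword arguments to the wrapped function. Whether mlshell wires scorer_kwargs this way is an assumption; the function name passthrough and the key "alpha" are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer


def passthrough(x, **kw_args):
    """Hypothetical stand-in for a mock step function: ignore kw_args, return x."""
    return x


# kw_args carries arbitrary values that a custom scorer could read back later.
step = FunctionTransformer(func=passthrough, kw_args={"alpha": 0.5})

x = np.array([[1.0, 2.0], [3.0, 4.0]])
result = step.fit_transform(x)

# The stored values stay reachable through the regular parameter interface.
stored = step.get_params()["kw_args"]
```

Because the data are returned untouched, such a step only serves as a container for parameters that can be set and searched like any other pipeline hyper-parameter.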
subcolumns(x, **kw_args)
Get sub-columns from x.
Parameters
x (numpy.ndarray or pandas.DataFrame) – Features of shape [n_samples, n_features].
**kw_args (dict) – Column indices to extract: {'indices': array-like}.
Returns
result – Extracted sub-columns of x.
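Column extraction driven by an 'indices' keyword can be sketched as below for the NumPy case. The function name select_columns is hypothetical, and treating the indices as positional is an assumption; a pandas.DataFrame input would need positional selection through .iloc instead.

```python
import numpy as np


def select_columns(x, **kw_args):
    """Hypothetical selector: keep the column positions in kw_args['indices']."""
    indices = kw_args["indices"]
    # Positional (integer) indexing along the feature axis.
    return x[:, indices]


x = np.arange(12).reshape(3, 4)
sub = select_columns(x, indices=[0, 2])
```

Passing the indices as a keyword argument is what lets them be stored and overwritten as a step parameter rather than hard-coded in the function.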
subrows(x)
Get rows from x.
bining_mask(x)
Get indices of features that need binning.