mlshell.PipelineProducer¶

class mlshell.PipelineProducer(objects, oid, path_id='path__default', logger_id='logger__default')[source]¶

Bases: pycnfg.producer.Producer

Factory to produce pipelines.

Interface: set, make, load, info.
- Parameters
  objects (dict) – Dictionary with objects from previously executed producers: {'section_id__config__id': object, ...}.
  oid (str) – Unique identifier of the produced object.
  path_id (str, optional (default='path__default')) – Project path identifier in objects.
  logger_id (str, optional (default='logger__default')) – Logger identifier in objects.
-
objects¶
  Dictionary with objects from previously executed producers: {'section_id__config__id': object, ...}.
  - Type
    dict
-
logger¶
  Logger.
  - Type
    logging.Logger
Methods

dict_api(obj[, method]) – Forwarding API for a dictionary object.
dump_cache(obj[, prefix, cachedir, pkg]) – Dump intermediate object state to IO.
info(pipeline, **kwargs) – Log pipeline info.
load(pipeline, filepath, **kwargs) – Load a fitted model from disk.
load_cache(obj[, prefix, cachedir, pkg]) – Load intermediate object state from IO.
make(pipeline[, steps, estimator, memory]) – Create a pipeline from steps.
run(init, steps) – Execute configuration steps.
set(pipeline, estimator) – Set an estimator as the pipeline.
update(obj, items) – Update key(s) for a dictionary object.
-
__init__(objects, oid, path_id='path__default', logger_id='logger__default')[source]¶
  Initialize self. See help(type(self)) for accurate signature.
-
set(pipeline, estimator)[source]¶
  Set estimator as pipeline.
  - Parameters
    pipeline (mlshell.Pipeline) – Pipeline object; will be updated.
    estimator (sklearn estimator) – Estimator to set under the mlshell.Pipeline interface.
  - Returns
    pipeline – Resulting pipeline.
  - Return type
    mlshell.Pipeline
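The intent of set can be sketched in plain Python. The names below (SimplePipeline, set_pipeline, DummyEstimator) are illustrative stand-ins, not the actual mlshell internals: the producer wraps an arbitrary estimator so that downstream steps see one uniform pipeline interface.

```python
class SimplePipeline:
    """Hypothetical stand-in for mlshell.Pipeline."""
    def __init__(self, pipeline=None):
        self.pipeline = pipeline  # the wrapped estimator/pipeline


def set_pipeline(pipeline, estimator):
    # Replace the wrapped estimator and hand the same holder back,
    # matching the producer contract of returning the updated object.
    pipeline.pipeline = estimator
    return pipeline


class DummyEstimator:
    """Minimal estimator-like object."""
    def fit(self, X, y=None):
        return self


p = set_pipeline(SimplePipeline(), DummyEstimator())
```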
-
make(pipeline, steps=None, estimator=None, memory=None, **kwargs)[source]¶
  Create pipeline from steps.
  - Parameters
    pipeline (mlshell.Pipeline) – Pipeline object; will be updated.
    steps (list or class, optional (default=None)) – Pipeline steps to pass to sklearn.pipeline.Pipeline. Could be a class with a steps attribute, which will be initialized as steps(estimator, **kwargs). If None, mlshell.pipeline.Steps is used. Set to [] to use estimator directly as the pipeline.
    estimator (sklearn estimator) – Estimator to use in the last step if steps is a class, or used directly if steps=[].
    memory (str or joblib.Memory interface, optional (default=None)) – memory argument passed to sklearn.pipeline.Pipeline. If 'auto', 'project_path/.temp/pipeline' is used.
    **kwargs (dict) – Additional kwargs to initialize steps (if provided as a class).
  - Returns
    pipeline – Resulting pipeline.
  - Return type
    mlshell.Pipeline
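The dispatch over the steps argument can be sketched in a simplified form. SimplePipeline and MySteps below are illustrative stand-ins for sklearn.pipeline.Pipeline and mlshell.pipeline.Steps; the real implementation may differ in details.

```python
class SimplePipeline:
    """Stand-in for sklearn.pipeline.Pipeline."""
    def __init__(self, steps, memory=None):
        self.steps = steps
        self.memory = memory


class MySteps:
    """Steps provided as a class: initialized as steps(estimator, **kwargs)."""
    def __init__(self, estimator, **kwargs):
        self.steps = [('scale', 'passthrough'), ('estimate', estimator)]


class Est:
    """Dummy estimator."""


def make(steps=None, estimator=None, memory=None, **kwargs):
    if steps == []:
        return estimator                        # steps=[] => estimator used directly
    if isinstance(steps, type):                 # class exposing a `steps` attribute
        steps = steps(estimator, **kwargs).steps
    elif steps is None:                         # default steps template
        steps = [('estimate', estimator)]
    return SimplePipeline(steps, memory=memory)


direct = make(steps=[], estimator=Est())        # the estimator itself
built = make(steps=MySteps, estimator=Est())    # two-step pipeline
```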
-
load(pipeline, filepath, **kwargs)[source]¶
  Load fitted model from disk.
  - Parameters
    pipeline (mlshell.Pipeline) – Pipeline object; will be updated.
    filepath (str) – Absolute path of the file to load, or a path relative to 'project_path' starting with './'.
    **kwargs (dict) – Additional parameters to pass to load().
  - Returns
    pipeline – Resulting pipeline.
  - Return type
    mlshell.Pipeline
-
info(pipeline, **kwargs)[source]¶
  Log pipeline info.
  - Parameters
    pipeline (mlshell.Pipeline) – Pipeline to explore (if a 'steps' attribute is available).
    **kwargs (dict) – Additional parameters to pass to low-level functions.
  - Returns
    pipeline – For compliance with producer logic.
  - Return type
    mlshell.Pipeline
-
dict_api(obj, method='update', **kwargs)¶
  Forwarding API for a dictionary object.
  Could be useful to add/pop keys via configuration steps. For example, to perform an update: ('dict_api', {'b': 7}).
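A plausible sketch of the forwarding, assuming the step kwargs are forwarded to the named dict method (shown here for the default method='update'; the exact forwarding in mlshell may differ):

```python
def dict_api(obj, method='update', **kwargs):
    # Forward kwargs to the named dict method, e.g. obj.update(b=7),
    # then return the object so the producer chain can continue.
    getattr(obj, method)(**kwargs)
    return obj


# Equivalent of the configuration step ('dict_api', {'b': 7}):
d = dict_api({'a': 1}, 'update', b=7)
```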
-
dump_cache(obj, prefix=None, cachedir=None, pkg='pickle', **kwargs)¶
  Dump intermediate object state to IO.
  - Parameters
    obj (picklable) – Object to dump.
    prefix (str, optional (default=None)) – File identifier, added to the filename. If None, 'self.oid' is used.
    cachedir (str, optional (default=None)) – Absolute path of the dump dir, or a path relative to 'project_path' starting with './'. Created if it does not exist. If None, 'project_path/.temp/objects' is used.
    pkg (str, optional (default='pickle')) – Import the package and try pkg.dump(obj, file, **kwargs).
    **kwargs (dict) – Additional parameters to pass to .dump().
  - Returns
    obj – Unchanged input, for compliance with producer logic.
  - Return type
    picklable
-
load_cache(obj, prefix=None, cachedir=None, pkg='pickle', **kwargs)¶
  Load intermediate object state from IO.
  - Parameters
    obj (picklable) – Object template, for producer logic only (ignored).
    prefix (str, optional (default=None)) – File identifier. If None, 'self.oid' is used.
    cachedir (str, optional (default=None)) – Absolute path of the load dir, or a path relative to 'project_path' starting with './'. If None, 'project_path/.temp/objects' is used.
    pkg (str, optional (default='pickle')) – Import the package and try obj = pkg.load(file, **kwargs).
    **kwargs (dict) – Additional parameters to pass to .load().
  - Returns
    obj – Loaded cache.
  - Return type
    picklable object
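A round-trip sketch of the dump_cache/load_cache pair with the default pkg='pickle'. The filename scheme and path handling are simplified here (mlshell derives them from project_path and self.oid); only the dynamic-import-and-dump/load pattern is the point.

```python
import importlib
import pathlib
import tempfile


def dump_cache(obj, prefix, cachedir, pkg='pickle', **kwargs):
    mod = importlib.import_module(pkg)            # e.g. pickle, dill
    path = pathlib.Path(cachedir)
    path.mkdir(parents=True, exist_ok=True)       # created if not exists
    with open(path / f'{prefix}.dump', 'wb') as f:
        mod.dump(obj, f, **kwargs)
    return obj                                    # unchanged, producer logic


def load_cache(obj, prefix, cachedir, pkg='pickle', **kwargs):
    mod = importlib.import_module(pkg)
    with open(pathlib.Path(cachedir) / f'{prefix}.dump', 'rb') as f:
        return mod.load(f, **kwargs)


cachedir = tempfile.mkdtemp()
state = dump_cache({'state': 42}, 'demo', cachedir)
restored = load_cache(None, 'demo', cachedir)
```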
-
run(init, steps)¶
  Execute configuration steps.
  Consecutive call (with decorators):
  init = getattr(self, 'method_id')(init, objects=objects, **kwargs)
  - Parameters
    init (object) – Passed as an argument into each step and returned as the result.
    steps (list of tuples) – List of self methods to run consecutively with kwargs: ('method_id', kwargs, decorators).
  - Returns
    configs – List of configurations prepared for execution: [('section_id__config__id', config), ...].
  - Return type
    list of tuple
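The consecutive-call contract above can be reproduced in a minimal, self-contained form. MiniProducer and its methods are illustrative only, and decorator handling is omitted: each ('method_id', kwargs) tuple is resolved on the producer with getattr and the result is threaded through init.

```python
class MiniProducer:
    """Toy producer demonstrating run()'s consecutive-call contract."""

    def run(self, init, steps):
        for step in steps:
            method_id = step[0]
            kwargs = step[1] if len(step) > 1 else {}
            # Each step receives the previous result and returns the next.
            init = getattr(self, method_id)(init, **kwargs)
        return init

    def add(self, init, value=0):
        return init + value

    def scale(self, init, factor=1):
        return init * factor


result = MiniProducer().run(1, [('add', {'value': 4}),
                               ('scale', {'factor': 2})])
```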
Notes

The object identifier oid is auto-added if the produced object has an oid attribute.