Wrappers for scikit-learn Data Streams

[Figure: MLPro-Int-scikit-learn-Streams.drawio.png]

Ver. 1.5.0 (2024-02-16)

This module provides wrappers that make public datasets of the scikit-learn ecosystem available as streams.

Learn more: https://scikit-learn.org

class mlpro_int_sklearn.wrappers.streams.WrStreamProviderSklearn(p_logging=True)

Bases: WrapperSklearn, StreamProvider

Wrapper class for Sklearn as StreamProvider.

C_NAME = 'scikit-learn'
_load_utils = ['fetch_20newsgroups()', 'fetch_20newsgroups_vectorized(as_frame=True)', 'fetch_california_housing()', 'fetch_covtype()', 'fetch_rcv1()', 'fetch_kddcup99()', 'load_diabetes()', 'load_iris()', 'load_breast_cancer()', 'load_wine()']
_data_utils = ['clear_data_home', 'dump_svmlight_file']
_datasets = ['20newsgroups', '20newsgroups_vectorized', 'california_housing', 'covtype', 'rcv1', 'kddcup99', 'diabetes', 'iris', 'breast_cancer', 'wine']
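The entries of _load_utils and _datasets line up one to one: each dataset name appears to be the loader name without its fetch_/load_ prefix and call arguments. The following minimal sketch checks that observation. The helper loader_to_dataset_name is hypothetical and only illustrates the naming pattern; it is not claimed to be how the wrapper itself resolves loaders.

```python
import re

# Values copied from the class attributes listed above.
LOAD_UTILS = [
    'fetch_20newsgroups()', 'fetch_20newsgroups_vectorized(as_frame=True)',
    'fetch_california_housing()', 'fetch_covtype()', 'fetch_rcv1()',
    'fetch_kddcup99()', 'load_diabetes()', 'load_iris()',
    'load_breast_cancer()', 'load_wine()',
]
DATASETS = [
    '20newsgroups', '20newsgroups_vectorized', 'california_housing',
    'covtype', 'rcv1', 'kddcup99', 'diabetes', 'iris', 'breast_cancer', 'wine',
]


def loader_to_dataset_name(loader: str) -> str:
    """Hypothetical helper: drop the call arguments and the fetch_/load_ prefix."""
    func_name = loader.split("(", 1)[0]
    return re.sub(r"^(fetch|load)_", "", func_name)


# Derive a dataset name from every loader string and compare with _datasets.
derived = [loader_to_dataset_name(loader) for loader in LOAD_UTILS]
names_match = derived == DATASETS
```

If the two lists ever drift apart, a check of this kind would flag the mismatch before a stream is requested by name.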
_get_stream_list(p_mode=0, p_logging=True, **p_kwargs) → list

Custom method to get a list of stream objects from Sklearn.

Returns:

list_streams – List of the streams available in Sklearn.

Return type:

List

_get_stream(p_id: str = None, p_name: str = None, p_mode=0, p_logging=True, **p_kwargs) → Stream

Custom method to fetch an Sklearn stream object.

Parameters:
  • p_id (str) – Optional Id of the requested stream. Default = None.

  • p_name (str) – Optional name of the requested stream. Default = None.

  • p_mode – Operation mode. Default: Mode.C_MODE_SIM.

  • p_logging – Log level (see constants of class Log). Default: Log.C_LOG_ALL.

  • p_kwargs (dict) – Further stream specific parameters.

Returns:

s – Stream object or None in case of an error.

Return type:

Stream
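The lookup contract described above (optional id, optional name, None on failure) can be sketched with plain Python. Everything in this sketch is an assumption made for illustration: StreamInfo and find_stream are hypothetical stand-ins rather than MLPro classes or methods, the ids are invented, and the matching rule (a stream matches if it equals any criterion that was supplied) is a guess, not documented behavior.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StreamInfo:
    """Hypothetical stand-in for a stream object; not an MLPro class."""
    stream_id: str
    name: str


def find_stream(streams: List[StreamInfo],
                p_id: Optional[str] = None,
                p_name: Optional[str] = None) -> Optional[StreamInfo]:
    """Return the first stream matching a supplied id or name, else None."""
    for stream in streams:
        if p_id is not None and stream.stream_id == p_id:
            return stream
        if p_name is not None and stream.name == p_name:
            return stream
    # Nothing matched, or no criterion was given: signal the error with None.
    return None


# Invented ids; the names are taken from the _datasets attribute.
catalog = [StreamInfo("0", "iris"), StreamInfo("1", "wine")]

by_name = find_stream(catalog, p_name="wine")
by_id = find_stream(catalog, p_id="0")
missing = find_stream(catalog, p_name="no_such_dataset")
```

Callers of a lookup with this contract need to check for None before using the result.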

class mlpro_int_sklearn.wrappers.streams.WrStreamSklearn(p_id, p_name, p_num_instances: int = 0, p_version: str = '', p_logging=True, p_mode=0, **p_kwargs)

Bases: Stream

Wrapper class for Streams from Sklearn

Parameters:
  • p_id – Id of the stream.

  • p_name (str) – Name of the stream.

  • p_num_instances (int) – Number of instances in the stream.

  • p_version (str) – Version of the stream. Default = ''.

  • p_feature_space (MSpace) – Optional feature space. Default = None.

  • p_label_space (MSpace) – Optional label space. Default = None.

  • p_mode – Operation mode. Valid values are stored in constant C_VALID_MODES.

  • p_logging – Log level (see constants of class Log). Default: Log.C_LOG_ALL.

  • p_kwargs (dict) – Further stream specific parameters.

C_NAME = 'scikit-learn stream'
C_SCIREF_TYPE = 'Online'
_reset()

Custom reset method to download and reset an Sklearn stream.

_setup_feature_space() → MSpace

Custom method to set up the feature space of the stream. It is called by method get_feature_space().

Returns:

feature_space – Feature space of the stream.

Return type:

MSpace

_setup_label_space() → MSpace

Custom method to set up the label space of the stream. It is called by method get_label_space().

Returns:

label_space – Label space of the stream.

Return type:

MSpace

_download()

Custom download method that assigns the related sklearn dataset and its functionalities to the _dataset attribute.

_get_next() → Instance

Custom method to get the instances of the Sklearn stream sequentially, one after another.

Returns:

Next instance in the Sklearn stream object (None after the last instance in the dataset).

Return type:

Instance
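The interplay of _reset() and _get_next() described above, including the None returned after the last instance, can be illustrated with a small stand-in. ListBackedStream is a hypothetical class written for this sketch; it does not derive from Stream, does not download anything, and uses toy rows instead of a real sklearn dataset.

```python
from typing import List, Optional, Tuple

# A toy "instance": a feature vector plus a label.
Row = Tuple[List[float], int]


class ListBackedStream:
    """Hypothetical stand-in for a sequential stream; not WrStreamSklearn."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = list(rows)
        self._index = 0

    def _reset(self) -> None:
        # Rewind to the first instance.
        self._index = 0

    def _get_next(self) -> Optional[Row]:
        # Return None once all instances have been delivered.
        if self._index >= len(self._rows):
            return None
        row = self._rows[self._index]
        self._index += 1
        return row


stream = ListBackedStream([([5.1, 3.5], 0), ([6.2, 2.9], 1)])
stream._reset()

# Drain the stream until the None sentinel appears.
collected = []
while (instance := stream._get_next()) is not None:
    collected.append(instance)

after_end = stream._get_next()
```

Calling _reset() again rewinds the stand-in, so the same instances can be read a second time.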