swectral.modelcombiners.create_bagging_model
- swectral.modelcombiners.create_bagging_model(base_estimator, n_estimators=50, max_samples=1.0, replace_sample=True, oversampling=False, feature_subset=None, replace_feature=False, random_state=None, regressor_aggregate='mean', limit_proba=None, is_classifier=None, name=None)
Create a bagging model instance from the specified base_estimator.
- Parameters:
- base_estimator: object
Any estimator implementing fit and predict following the scikit-learn API. If the estimator implements predict_proba, the ensemble operates in classification mode.
- n_estimators: int, optional
Number of base estimators to train in the ensemble. Default is 50.
- max_samples: float, optional
Fraction of the training samples to draw for each base estimator. Must be in the interval (0, 1]. Default is 1.0.
- replace_sample: bool, optional
Whether samples are drawn with replacement. If False, samples are drawn without replacement. Default is True.
- oversampling: bool, optional
Whether to oversample rare cases in the training data.
For categorical targets, rare classes are upsampled to reduce class imbalance.
For continuous targets, underrepresented target regions are upsampled. The target space is divided into adaptive bins, with a maximum of 10 bins.
Default is False.
- feature_subset: str, float, int, or None
Strategy for selecting a subset of features for each base estimator. Options are:
"sqrt": Use the square root of the total number of features.
"log": Use log2 of the total number of features.
float between 0 and 1: Use this fraction of the total features.
int: Use this exact number of features (must be positive).
None: Use all features; no feature resampling is applied.
If features are resampled, they are selected randomly according to the specified strategy. Default is None.
- replace_feature: bool
Whether feature resampling is performed with replacement. If False, feature resampling is performed without replacement. Default is False.
- random_state: int or None, optional
Seed used by the random number generator for reproducible bootstrap sampling. Default is None.
- regressor_aggregate: str or tuple of two float, optional
Aggregation method for regressors. Choose between:
"mean": Use the average of the base estimator predictions.
"median": Use the median of the base estimator predictions.
tuple of two float: Use a trimmed mean, keeping only predictions within the given quantile range (e.g., (0.1, 0.9)).
Default is "mean".
- limit_proba: None or tuple of two float, optional
Probability limits for the ensemble. Every probability predicted by a base model is restricted to this range. If None, no limit is applied. Default is None.
- is_classifier: bool or None, optional
Whether the base estimator should be treated as a classifier. If None, the ensemble detects the type automatically by inspecting the base estimator for the attributes _estimator_type or classes_, or the method predict_proba. Default is None.
- name: str or None, optional
Name of the created model class. If None, the class name is 'Bagging<BaseEstimatorClassName>'. Default is None.
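The sampling parameters above (max_samples, replace_sample, feature_subset, replace_feature) can be illustrated with a small standard-library sketch. The helpers resolve_n_features and draw_indices are hypothetical names introduced here, not part of swectral, and the rounding rules (truncation, clamping to at least one feature, accepting 1.0 as a fraction) are assumptions, since the documentation does not state them.

```python
import math
import random


def resolve_n_features(feature_subset, n_features):
    """Map a feature_subset setting to a feature count.

    Truncating toward zero and clamping to [1, n_features] are assumptions.
    """
    if feature_subset is None:
        return n_features  # use every feature, no resampling
    if feature_subset == "sqrt":
        k = int(math.sqrt(n_features))
    elif feature_subset == "log":
        k = int(math.log2(n_features))
    elif isinstance(feature_subset, float):
        if not 0.0 < feature_subset <= 1.0:
            raise ValueError("float feature_subset must lie in (0, 1]")
        k = int(feature_subset * n_features)
    elif isinstance(feature_subset, int):
        if feature_subset <= 0:
            raise ValueError("int feature_subset must be positive")
        k = feature_subset
    else:
        raise ValueError(f"unsupported feature_subset: {feature_subset!r}")
    return max(1, min(k, n_features))


def draw_indices(n_total, n_draw, replace, rng):
    """Draw n_draw indices from range(n_total), with or without replacement."""
    if replace:
        return [rng.randrange(n_total) for _ in range(n_draw)]
    return rng.sample(range(n_total), n_draw)


# One bootstrap draw for a single base estimator (illustrative sizes).
rng = random.Random(0)  # plays the role of random_state
n_samples, n_features = 200, 64
n_rows = max(1, int(0.5 * n_samples))  # max_samples=0.5
rows = draw_indices(n_samples, n_rows, replace=True, rng=rng)
cols = draw_indices(
    n_features, resolve_n_features("log", n_features), replace=False, rng=rng
)
```

Each base estimator would then be fitted on the selected rows and columns only; how swectral stores the column selection for prediction time is not covered by this sketch.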
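The regressor_aggregate and limit_proba options from the parameter list can be sketched in plain Python as follows. The function names aggregate, quantile and clip_proba are hypothetical, and the linear quantile interpolation and inclusive bounds used for the trimmed mean are assumptions; the library may compute the quantile range differently.

```python
import statistics


def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list (assumed method)."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def aggregate(predictions, regressor_aggregate="mean"):
    """Combine the base estimators' predictions for one sample."""
    if regressor_aggregate == "mean":
        return statistics.fmean(predictions)
    if regressor_aggregate == "median":
        return statistics.median(predictions)
    # Otherwise expect a (low_quantile, high_quantile) tuple: trimmed mean.
    low_q, high_q = regressor_aggregate
    ordered = sorted(predictions)
    low, high = quantile(ordered, low_q), quantile(ordered, high_q)
    kept = [p for p in ordered if low <= p <= high]
    return statistics.fmean(kept)


def clip_proba(p, limit_proba=None):
    """Restrict one base-model probability to the limit_proba range."""
    if limit_proba is None:
        return p
    low, high = limit_proba
    return min(max(p, low), high)


# Five base-estimator predictions for one sample, one of them an outlier.
preds = [1.0, 2.0, 3.0, 4.0, 100.0]
mean_pred = aggregate(preds, "mean")
median_pred = aggregate(preds, "median")
trimmed_pred = aggregate(preds, (0.1, 0.9))
```

With an outlying base prediction such as 100.0, the median and the trimmed mean stay close to the bulk of the predictions while the plain mean is pulled upward, which is the usual reason to pick one of the robust options.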
- Returns:
A bagging model instance with a sklearn-style model interface.
- Return type:
object
Examples
Basic Usage:
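from sklearn.cross_decomposition import PLSRegression
from swectral.modelcombiners import create_bagging_model

# X_train, y_train and X_test are assumed to be existing arrays.
model = create_bagging_model(
    base_estimator=PLSRegression(n_components=5),
    n_estimators=100,
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)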
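Type detection and default naming:
The automatic detection described for is_classifier and the default class name described for name can be approximated with duck typing. This is a minimal sketch, not the library's code: looks_like_classifier, default_name and the two toy estimators are hypothetical, and the order in which the attributes are checked is an assumption.

```python
def looks_like_classifier(estimator, is_classifier=None):
    """Guess whether an estimator is a classifier (assumed check order)."""
    if is_classifier is not None:
        return bool(is_classifier)  # an explicit setting wins
    if getattr(estimator, "_estimator_type", None) == "classifier":
        return True
    if hasattr(estimator, "classes_"):
        return True
    return callable(getattr(estimator, "predict_proba", None))


def default_name(estimator, name=None):
    """Return the given name, or 'Bagging<BaseEstimatorClassName>' if None."""
    return name if name is not None else "Bagging" + type(estimator).__name__


class ToyRegressor:
    """Hypothetical estimator with only fit and predict."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [0.0 for _ in X]


class ToyClassifier(ToyRegressor):
    """Hypothetical estimator that also exposes predict_proba."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]
```

Passing is_classifier explicitly is the safer choice for estimators whose interface is ambiguous, for example a regressor wrapper that happens to expose predict_proba.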