swectral.SpecPipe#
- class swectral.SpecPipe(spec_exp, space_wait_timeout=36000, reserve_free_pct=5.0)[source]#
Design and implement processing and modeling pipelines on spectral experiment datasets.
- Attributes:
- spec_exp
SpecExp Instance of SpecExp configuring spectral experiment datasets. See SpecExp for details.
- report_directory
str Root directory where reports are stored. This value is automatically derived from the report_directory attribute of the provided spec_exp instance.
- space_wait_timeout
int Number of seconds to wait for disk space to become available before raising an error when the disk is full. Default is 36000 (10 hours).
- reserve_free_pct
float Minimum percentage of free disk space required to proceed with processing. Default is 5.0 (5% of total storage capacity).
- process
list of tuple Added process items. Each tuple represents a process definition and contains:
process_id : str
process_label : str
input_data_level : str
output_data_level : str
application_sequence : int
method : callable
full_application_sequence : int
alternative_number : int
- process_steps
list of tuple of str Processes of each pipeline step; each tuple represents a step. Processes are represented by process ID.
- process_chains
list of tuple of str Generated full-factorial processing chains; each tuple represents a processing chain. Processes are represented by process ID.
- custom_chains
list of tuple of str Customized subset of the full-factorial process_chains.
- create_time
str Creation date and time of this SpecPipe instance.
Methods
add_process(input_data_level, ...[, ...])Add a processing method with defined input/output data levels and application sequence to the pipeline.
ls_process([process_id, process_label, ...])List process items based on filtering conditions.
rm_process([process_id, process_label, ...])Remove process items based on filtering conditions.
add_model(model_method[, model_label, ...])Add a model evaluation process to the processing pipeline.
ls_model([model_id, model_label, ...])List added model evaluation processes based on filtering conditions.
rm_model([model_id, model_label, ...])Remove added model evaluation processes from this SpecPipe instance based on filtering conditions.
process_chains_to_df([stage, print_label, ...])List process chains.
custom_chains_from_df(process_chain_dataframe)Customize processing chains and update chains using a chain dataframe.
custom_chains_to_df([stage, print_label, ...])List customized process chains.
ls_chains([stage, print_label, return_label])List process chains for the pipeline execution.
save_pipe_config([copy, save_spec_exp_config])Save the current pipeline configuration files to the root of the report directory.
load_pipe_config([config_file_path])Load SpecPipe configuration from a dill file.
test_run([test_modeling, return_result, ...])Run the pipeline of all processing chains using simplified test data.
preprocessing([n_processor, resume, ...])Run preprocessing steps of all processing chains on the entire dataset and output modeling-ready sample_list data to files.
assembly([n_processor, resume, dump_backup, ...])Apply assembly process to introduce cross-sample interactions prior to modeling.
model_evaluation([n_processor, resume, ...])Evaluate added models using processed sample data generated by all preprocessing chains.
run([result_directory, n_processor, ...])Run entire pipelines of specified processes of this SpecPipe instance on the provided SpecExp instance.
Retrieve summary of generated reports in the console.
Retrieve major model evaluation reports of every processing chain in the console.
See also
Examples
Create a SpecPipe instance using a prepared SpecExp instance exp:
>>> pipe = SpecPipe(exp)
Methods
__init__(spec_exp[, space_wait_timeout, ...])
add_model(model_method[, model_label, ...])Add a model evaluation process to the processing pipeline.
add_process(input_data_level, ...[, ...])Add a processing method with defined input/output data levels and application sequence to the pipeline.
assembly([n_processor, resume, dump_backup, ...])Apply assembly process to introduce cross-sample interactions prior to modeling.
build_pipeline(step_methods)Build pipelines by given structure and methods of each step.
custom_chains_from_df(process_chain_dataframe)Customize processing chains and update chains using a chain dataframe.
custom_chains_to_df([stage, print_label, ...])List customized process chains.
load_config([config_file_path])Load SpecPipe configuration from a dill file.
load_pipe_config([config_file_path])Load SpecPipe configuration from a dill file.
ls_chains([stage, print_label, return_label])List process chains for the pipeline execution.
ls_custom_chains([stage, print_label, ...])List customized process chains.
ls_model([model_id, model_label, ...])List added model evaluation processes based on filtering conditions.
ls_process([process_id, process_label, ...])List process items based on filtering conditions.
ls_process_chains([stage, print_label, ...])List process chains.
model_evaluation([n_processor, resume, ...])Evaluate added models using processed sample data generated by all preprocessing chains.
preprocessing([n_processor, resume, ...])Run preprocessing steps of all processing chains on the entire dataset and output modeling-ready sample_list data to files.
process_chains_to_df([stage, print_label, ...])List process chains.
Retrieve major model evaluation reports of every processing chain in the console.
Retrieve summary of generated reports in the console.
rm_model([model_id, model_label, ...])Remove added model evaluation processes from this SpecPipe instance based on filtering conditions.
rm_process([process_id, process_label, ...])Remove process items based on filtering conditions.
run([result_directory, n_processor, ...])Run entire pipelines of specified processes of this SpecPipe instance on the provided SpecExp instance.
save_config([copy, save_spec_exp_config])Save the current pipeline configuration files to the root of the report directory.
save_pipe_config([copy, save_spec_exp_config])Save the current pipeline configuration files to the root of the report directory.
test_run([test_modeling, return_result, ...])Run the pipeline of all processing chains using simplified test data.
update_spec_exp(spec_exp)
Attributes
- property space_wait_timeout#
- property reserve_free_pct#
- property report_directory#
- property spec_exp#
- property process#
- property process_steps#
- property process_chains#
- property custom_chains#
- property create_time#
- add_model(model_method, model_label='', input_data_level=None, test_error_raise=True, is_regression=None, validation_method='2-fold', unseen_threshold=0.0, x_shape=None, result_backup=False, data_split_config='default', validation_config='default', metrics_config='default', roc_plot_config='default', scatter_plot_config='default', residual_config='default', residual_plot_config='default', influence_analysis_config='default', save_application_model=True)[source]#
Add a model evaluation process to the processing pipeline.
The added model operates on 1D data (data level 7/"spec1d") and produces model-level output (data level 9/"model"). All models share a unified application sequence within the pipeline.
- Parameters:
- model_method
object Sklearn-style model object.
Regression models must implement fit and predict. Classification models must additionally implement predict_proba.
- model_label
str, optional Custom label for the added model.
If empty string, a label is automatically generated.
- input_data_level
int or str Input data level for the process. Choose between:
7 or "spec1d" If the callable is applied to 1D array-like sample spectra or flattened data, such as ROI spectral statistics.
8 or "assembly" If the method is a model instance or secondary assembly function and applied following any custom assembly processes.
If None, the data level is automatically determined according to the availability of an "assembly" process. Default is None. See add_process for more details.
- test_error_raise
bool, optional Whether to raise an error when the model fails validation using simplified mock data before being added to the pipeline.
If True, an exception is raised; otherwise only a warning is issued. Default is True.
- is_regression
bool, optional Whether the model is a regression model.
If None, the model type is inferred from sample target values. Default is None.
- validation_method
str, optional Validation strategy for model evaluation. Supported formats include:
"loo" for leave-one-out cross-validation
"k-fold" (e.g. "5-fold") for k-fold cross-validation
"m-n-split" (e.g. "70-30-split") for train-test split
Default is "2-fold".
- unseen_threshold
float, optional Classification-only parameter.
If the highest predicted class probability of a sample is below this threshold, the sample is assigned to an unknown class. Default is 0.0.
- x_shape
tuple of int, optional Expected shape of independent variables for models requiring structured input. Default is None.
Currently ignored.
- result_backup
bool, optional Whether to save timestamped backup copies of result files. Default is False.
- data_split_config
str or dict, optional Additional data splitting configuration.
If a dictionary of parameters is provided, it may include:
random_state : int Random state for splitting and shuffling.
Default is "default", which uses the default data splitting behavior.
- validation_config
str or dict, optional Validation behavior configuration.
If a dictionary of parameters is provided, it may include:
unseen_threshold : float If a class unseen in the training data exists, a test sample is predicted as the unseen class when the predicted probabilities of all seen classes are lower than this threshold. Default is 0 (only predict seen classes).
use_original_shape : bool Whether the original data shape is applied for the model. Currently unused. Default is False.
save_fold_model : bool Whether models of the validation folds are saved to files. Default is True.
save_fold_data : bool Whether data of the validation folds are saved to files. Default is True.
Default is "default", which uses the default validation behavior.
- metrics_config
str or dict or None, optional Metrics computation configuration.
If None, metric computation is skipped. Default is "default". Currently only "default" is supported.
- roc_plot_config
str or dict or None, optional Receiver Operating Characteristic (ROC) plotting configuration for classification models.
Not used for regression models.
If None, ROC plot generation is skipped.
If a dictionary of parameters is provided, it may include:
plot_title : str Title of the ROC plot. Default is 'ROC Curve'.
title_size : int or float Font size of the plot title. Default is 26.
title_pad : int or float or None Padding between the title and the plot. Default is None.
figure_size : tuple of 2 (float or int) Figure size as (width, height). Default is (8, 8).
plot_margin : tuple of 4 float Plot margins as (left, right, top, bottom). Default is (0.15, 0.95, 0.9, 0.13).
plot_line_width : int or float Line width of the ROC curve. Default is 3.
plot_line_alpha : float Alpha value of the ROC curve line. Default is 0.8.
diagnoline_width : int or float Line width of the diagonal reference line. Default is 3.
x_axis_limit : tuple of 2 (float or int) or None X-axis limits as (min, max). Default is None.
x_axis_label : str Label of the x-axis. Default is 'False Positive Rate'.
x_axis_label_size : int or float Font size of the x-axis label. Default is 26.
x_tick_size : int or float Font size of x-axis tick labels. Default is 24.
x_tick_number : int Number of x-axis ticks. Default is 6.
y_axis_limit : tuple of 2 (float or int) or None Y-axis limits as (min, max). Default is None.
y_axis_label : str Label of the y-axis. Default is 'True Positive Rate'.
y_axis_label_size : int or float Font size of the y-axis label. Default is 26.
y_tick_size : int or float Font size of y-axis tick labels. Default is 24.
y_tick_number : int Number of y-axis ticks. Default is 6.
axis_line_size_left : int or float or None Line width of the left axis spine. Default is 1.5.
axis_line_size_right : int or float or None Line width of the right axis spine. Default is 1.5.
axis_line_size_top : int or float or None Line width of the top axis spine. Default is 1.5.
axis_line_size_bottom : int or float or None Line width of the bottom axis spine. Default is 1.5.
legend : bool Whether to display the legend. Default is True.
legend_location : str Legend location string accepted by matplotlib. Default is 'lower right'.
legend_fontsize : int or float Font size of legend entries. Default is 20.
legend_title : str Legend title text. Default is empty.
legend_title_fontsize : int or float Font size of the legend title. Default is 24.
background_grid : bool Whether to show a background grid. Default is False.
show_plot : bool Whether to display the plot interactively. Default is False.
Default is "default", which uses the default plotting behavior.
- scatter_plot_config
str or dict or None, optional Scatter plot configuration for regression models.
If None, scatter plot generation is skipped.
If a dictionary of parameters is provided, it may include:
plot_title : str Plot title text. Default is ''.
title_size : int or float Font size of the plot title. Default is 26.
title_pad : int or float or None Padding between the title and the plot. Default is None.
figure_size : tuple of 2 (float or int) Figure size in inches as (width, height). Default is (8, 8).
plot_margin : tuple of 4 float Plot margins as (left, right, top, bottom). Default is (0.2, 0.95, 0.95, 0.15).
plot_line_width : int or float Line width of plotted curves. Default is 3.
point_size : int or float Size of plotted points. Default is 120.
point_color : str Color of plotted points. Default is 'firebrick'.
point_alpha : float Transparency of plotted points. Default is 0.7.
x_axis_limit : tuple of 2 (float or int) or None Limits of the x-axis. Default is None.
x_axis_label : str Label of the x-axis. Default is 'Predicted target values'.
x_axis_label_size : int or float Font size of the x-axis label. Default is 26.
x_tick_values : list of int or float or None Explicit tick values for the x-axis. Default is None.
x_tick_size : int or float Font size of x-axis ticks. Default is 24.
x_tick_number : int Number of x-axis ticks. Default is 5.
y_axis_limit : tuple of 2 (float or int) or None Limits of the y-axis. Default is None.
y_axis_label : str Label of the y-axis. Default is 'Residuals'.
y_axis_label_size : int or float Font size of the y-axis label. Default is 26.
y_tick_values : list of int or float or None Explicit tick values for the y-axis. Default is None.
y_tick_size : int or float Font size of y-axis ticks. Default is 24.
y_tick_number : int Number of y-axis ticks. Default is 5.
axis_line_size_left : int or float or None Line width of the left axis spine. Default is 1.0.
axis_line_size_right : int or float or None Line width of the right axis spine. Default is 1.5.
axis_line_size_top : int or float or None Line width of the top axis spine. Default is 1.5.
axis_line_size_bottom : int or float or None Line width of the bottom axis spine. Default is 1.5.
background_grid : bool Whether to display background grid lines. Default is False.
show_plot : bool Whether to display the plot immediately. Default is False.
Default is "default", which uses the default plotting behavior.
- residual_config
str or dict or None, optional Residual analysis configuration.
If None, residual analysis is skipped. Default is "default", which uses the default residual analysis behavior.
- residual_plot_config
str or dict or None, optional Residual plot configuration for regression models.
If None, residual plot generation is skipped.
If a dictionary of parameters is provided, the available parameters are the same as for scatter_plot_config.
Default is "default", which uses the default plotting behavior.
- influence_analysis_config
str or dict or None, optional Influence analysis configuration. When enabled, computes a Cook's distance–like influence measure for each sample using a Leave-One-Out (LOO) approach.
If None, influence analysis is skipped.
Note: This computation can be very time-consuming for large datasets. For such cases, consider using a simple validation method or setting this option to None.
If a dictionary of parameters is provided, it may include:
validation_method : bool, optional Whether to use independent validation for leave-one-out influence analysis.
random_state : int or None, optional Random state for data splitting.
Default is "default", which uses the default influence analysis behavior.
- save_application_model
bool, optional Whether the application model is trained on all data and stored in the chain report. Default is True.
- Return type:
See also
Examples
Create a SpecPipe instance from an existing SpecExp object:
>>> pipe = SpecPipe(exp)
Add a model with a specified validation method:
>>> from sklearn.neighbors import KNeighborsClassifier
>>> knn = KNeighborsClassifier(n_neighbors=3)
>>> pipe.add_model(knn, validation_method="5-fold")
Use different validation strategies:
>>> pipe.add_model(knn, validation_method="60-40-split")
>>> pipe.add_model(knn, validation_method="loo")
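The unseen_threshold rule described above can be illustrated with a small sketch. This is a hypothetical re-implementation for illustration only; swectral's internal logic may differ:

```python
# Hypothetical illustration of the documented unseen_threshold rule:
# if the highest predicted probability among seen classes falls below
# the threshold, the sample is assigned to an unknown class.

def assign_class(class_probs, unseen_threshold=0.0, unknown_label="unknown"):
    """class_probs: dict mapping seen class labels to predicted probabilities."""
    label, prob = max(class_probs.items(), key=lambda kv: kv[1])
    if prob < unseen_threshold:
        return unknown_label
    return label

print(assign_class({"healthy": 0.9, "diseased": 0.1}, unseen_threshold=0.5))   # healthy
print(assign_class({"healthy": 0.4, "diseased": 0.35}, unseen_threshold=0.5))  # unknown
```

With the default threshold of 0.0, the second sample would instead be assigned to the most probable seen class.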
- ls_model(model_id=None, model_label=None, model_method=None, *, exact_match=True, print_result=True, return_result=False)[source]#
List added model evaluation processes based on filtering conditions.
If a filter criterion is not provided or None, the corresponding filter is not applied.
- Parameters:
- model_id
str, optional Model evaluation process ID. Default is None.
- model_label
str, optional Custom model label. Default is None.
- model_method
str or object, optional Model object or method. Default is None.
- exact_match
bool, optional If False, any process with a property value containing the specified value is included. Default is True.
- print_result
bool, optional If True, simplified results are printed. Default is True.
- return_result
bool, optional If True, a complete resulting DataFrame is returned. Default is False.
- Returns:
pandas.DataFrame or None If return_result=True, returns a pandas DataFrame of matched model evaluation processes.
If return_result=False, returns None.
- Return type:
Optional[DataFrame]
Examples
For a prepared SpecExp instance exp:
>>> pipe = SpecPipe(exp)
>>> from sklearn.neighbors import KNeighborsClassifier
>>> knn = KNeighborsClassifier(n_neighbors=3)
>>> pipe.add_model(knn, validation_method='2-fold')
List all models:
>>> pipe.ls_model()
Return model items as a DataFrame:
>>> df = pipe.ls_model(return_result=True)
Filter results by model label:
>>> pipe.ls_model(model_label='KNeighbor', exact_match=False)
- rm_model(model_id=None, model_label=None, model_method=None, exact_match=True)[source]#
Remove added model evaluation processes from this SpecPipe instance based on filtering conditions.
If a filter criterion is not provided or None, the corresponding filter is not applied.
- Parameters:
- model_id
str, optional Model evaluation process ID. Default is None.
- model_label
str, optional Custom model label. Default is None.
- model_method
str or object, optional Model object or method. Default is None.
- exact_match
bool, optional If False, any process with a property value containing the specified value is removed. Default is True.
- Return type:
Examples
For a prepared SpecExp instance exp:
>>> pipe = SpecPipe(exp)
>>> from sklearn.neighbors import KNeighborsClassifier
>>> knn = KNeighborsClassifier(n_neighbors=3)
>>> pipe.add_model(knn, validation_method='2-fold')
Remove all models:
>>> pipe.rm_model()
Remove a specific model:
>>> pipe.rm_model(model_label='KNeighbor')
- add_process(input_data_level, output_data_level, application_sequence, method, process_label='', *, test_error_raise=True, is_regression=None, validation_method='2-fold', unseen_threshold=0.0, x_shape=None, result_backup=False, data_split_config='default', validation_config='default', metrics_config='default', roc_plot_config='default', scatter_plot_config='default', residual_config='default', residual_plot_config='default', influence_analysis_config='default', save_application_model=True)[source]#
Add a processing method with defined input/output data levels and application sequence to the pipeline. A processing method can be a preprocessing function or a model for evaluation.
- Parameters:
- input_data_level
int or str Input data level for the process. Available options:
0 or "image" If the callable is applied to raster images. The corresponding callable must accept the input raster path as the first argument and the output path as the second argument.
1 or "pixel_spec" If the callable is applied to 1D spectra of each pixel.
2 or "pixel_specs_array" If the callable is applied to a 2D numpy.ndarray of pixel spectra. Each row is a pixel spectrum.
3 or "pixel_specs_tensor" If the callable is applied to a 3D torch.tensor (shape: (C, H, W)), with computation performed along axis 0.
4 or "pixel_hyperspecs_tensor" If the callable is applied to a 3D hyperspectral torch.tensor (shape: (C, H, W)), with computation performed along axis 1.
5 or "image_roi" If the callable is applied to a region of interest (ROI) within a raster image. The callable must receive the raster path and ROI coordinates supplied by the provided SpecExp instance.
6 or "roi_specs" If the callable is applied to a 2D numpy.ndarray of ROI spectra; each row is a pixel.
7 or "spec1d" If the callable is applied to 1D array-like sample spectra or flattened data, such as ROI spectral statistics.
For assembly methods, the callable must instead accept a list of sample records. Each element must have the structure:
(sample_id, sample_label, validation_group, test_mask, train_mask, original_shape, target_value, predictors)
where:
sample_id : str
sample_label : str
validation_group : str
test_mask : numpy.int8
train_mask : numpy.int8
original_shape : tuple of int
target_value : Any
predictors : array-like of shape (n_features,)
For model methods, this 1D array-like data is automatically assembled across samples, and the model must accept a 2D numpy.ndarray of shape (n_samples, n_features).
8 or "assembly" If the method is a model instance or secondary assembly function and applied following any custom assembly processes.
Note specific to this parameter:
Input data levels 0 through 4 share a single, common application_sequence scheme. These data levels do not maintain independent application sequence series, in contrast to input data levels 5, 6, and 7.
For example, a process defined with input data level 0 ("image") and application_sequence=0 and a process defined with input data level 2 ("pixel_specs_array") and application_sequence=0 are treated as parallel operations within the same image-processing step.
- output_data_level
int or str Output data level. Available options:
0 or "image" If the callable returns a raster image path.
1 or "pixel_spec" Same as input "pixel_spec".
2 or "pixel_specs_array" Same as input "pixel_specs_array".
3 or "pixel_specs_tensor" Same as input "pixel_specs_tensor".
4 or "pixel_hyperspecs_tensor" Same as input "pixel_hyperspecs_tensor".
5 or "image_roi" Currently unavailable.
6 or "roi_specs" If the callable returns a 2D numpy.ndarray of ROI spectra.
7 or "spec1d" If the callable returns 1D array-like spectral data.
8 or "assembly" If the callable is a sample assembly function for global sample processing with cross-sample interactions before modeling.
The callable must return a list of sample records with structure and typing identical to its input, although the values of tuple elements may differ.
See "spec1d" under input_data_level for the required record schema.
9 or "model" Used for modeling; accepts "spec1d" or "assembly" input.
- application_sequence
int Sequence number of the method within the same input data level. Lower numbers execute first.
- method
callable or object or list of (callable or object) Processing method: a processing function or sklearn-style estimator.
The callable or estimator must accept inputs and produce outputs that conform to the configured data levels.
If an estimator is provided, it must follow the sklearn estimator interface, implementing fit and predict, and predict_proba for classifiers.
A list of methods may be provided to specify multiple methods that share the same configuration.
- process_label
str or list of str, optional Custom label(s) for the process.
If provided:
If a single method is provided, must be a single label str.
If multiple methods are provided, must be a list of str with length equal to the number of methods.
Default is an empty string, which automatically generates label(s) from the callable name(s) or the estimator class name(s).
- test_error_raise
bool, optional Whether to raise an error when the process fails validation using simplified mock data before being added to the pipeline.
If True, an exception is raised; otherwise only a warning is issued. Default is True.
- is_regression
bool, optional Whether the model is a regression model. See add_model for details.
- validation_method
str, optional Validation strategy for model evaluation. See add_model for details.
- unseen_threshold
float, optional Classification-only parameter. See add_model for details.
- x_shape
tuple of int, optional Expected shape of independent variables for models requiring structured input. Currently ignored. See add_model for details.
- result_backup
bool, optional Whether to save timestamped backup copies of result files. See add_model for details.
- data_split_config
str or dict, optional Additional data splitting configuration. See add_model for details.
- validation_config
str or dict, optional Validation behavior configuration. See add_model for details.
- metrics_config
str or dict or None, optional Metrics computation configuration. See add_model for details.
- roc_plot_config
str or dict or None, optional Receiver Operating Characteristic (ROC) plotting configuration for classification models. See add_model for details.
- scatter_plot_config
str or dict or None, optional Scatter plot configuration for regression models. See add_model for details.
- residual_config
str or dict or None, optional Residual analysis configuration. See add_model for details.
- residual_plot_config
str or dict or None, optional Residual plot configuration for regression models. See add_model for details.
- influence_analysis_config
str or dict or None, optional Influence analysis configuration. See add_model for details.
- save_application_model
bool, optional Whether the application model is trained on all data and stored in the chain report. See add_model for details.
- Return type:
Examples
For a prepared SpecExp instance exp:
>>> pipe = SpecPipe(exp)
Add an image processor accepting image path and returning processed path:
>>> pipe.add_process('image', 'image', 0, img_processor)
Or using numeric level indices:
>>> pipe.add_process(0, 0, 0, img_processor)
Customize the process label:
>>> pipe.add_process(0, 0, 0, img_processor, process_label='img_proc')
Apply function to pixel spectra array:
>>> from swectral.functions import snv
>>> pipe.add_process('pixel_specs_array', 'pixel_specs_array', 0, snv)
GPU processing example:
>>> from swectral.functions import snv_hyper
>>> pipe.add_process(4, 4, 0, snv_hyper)
Denoiser on ROI spectra:
>>> from swectral.denoiser import LocalPolynomial
>>> pipe.add_process(6, 6, 0, LocalPolynomial(5, polynomial_order=2).savitzky_golay_filter)
Process 1D sample spectra:
>>> pipe.add_process(7, 7, 0, LocalPolynomial(5, polynomial_order=2).savitzky_golay_filter)
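As a concrete illustration of the sample-record schema required for assembly methods (level 7 to 8), below is a hedged sketch of a custom assembly function that mean-centers predictors across all samples. The record layout follows the tuple documented under input_data_level, but the function itself and the record values are hypothetical:

```python
# Hypothetical assembly function: receives a list of sample records
# (sample_id, sample_label, validation_group, test_mask, train_mask,
#  original_shape, target_value, predictors) and returns records of the
# same structure with predictors mean-centered across samples.

def mean_center_assembly(records):
    n = len(records)
    n_features = len(records[0][7])
    # Column-wise mean over all samples' predictors
    means = [sum(rec[7][j] for rec in records) / n for j in range(n_features)]
    out = []
    for rec in records:
        centered = [x - m for x, m in zip(rec[7], means)]
        out.append(rec[:7] + (centered,))  # keep all other fields unchanged
    return out

# Illustrative records (values are made up)
records = [
    ("s1", "leaf_1", "g1", 0, 1, (3,), 1.0, [1.0, 2.0, 3.0]),
    ("s2", "leaf_2", "g1", 1, 0, (3,), 0.0, [3.0, 4.0, 5.0]),
]
print(mean_center_assembly(records)[0][7])  # [-1.0, -1.0, -1.0]
```

A function of this shape could presumably be registered with output data level 8 ("assembly"), since it returns records identical in structure to its input.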
- ls_process(process_id=None, process_label=None, input_data_level=None, output_data_level=None, application_sequence=None, method=None, full_application_sequence=None, *, exact_match=True, print_result=True, return_result=False)[source]#
List process items based on filtering conditions. If a filter criterion is None, the corresponding filter is not applied.
- Parameters:
- process_id
str, optional Process ID. The default is None.
- process_label
str, optional Custom process label. The default is None.
- input_data_level
str or int, optional Input data level of the process.
See add_process for available options. The default is None.
- output_data_level
str or int, optional Output data level of the process.
See add_process for available options. The default is None.
- application_sequence
int or tuple of int, optional Exact sequence number or a sequence number range within a data level.
Ranges must be specified as a tuple. The default is None.
- method
str or callable or object, optional Method function, method name, or method object. The default is None.
- full_application_sequence
int or tuple of int, optional Exact sequence number or a sequence number range within the entire pipeline. Ranges must be specified as a tuple. The default is None.
- exact_match
bool, optional If False, processes whose property values partially match the specified value are included. The default is True.
- print_result
bool, optional Whether to print simplified matched process items. The default is True.
- return_result
bool, optional Whether to return a dataframe of matched process items. The default is False.
- Returns:
pandas.DataFrame or None If return_result=True, returns a pandas DataFrame of matched process items.
If return_result=False, returns None.
- Return type:
Optional[DataFrame]
See also
Examples
For a prepared SpecExp instance exp:
>>> pipe = SpecPipe(exp)
>>> from swectral.functions import snv
>>> pipe.add_process(2, 2, 0, snv)
List all added processes:
>>> pipe.ls_process()
List processes by input data level:
>>> pipe.ls_process(input_data_level=2)
List processes by output data level:
>>> pipe.ls_process(output_data_level=2)
List processes by method:
>>> pipe.ls_process(method='snv')
List processes by partial method name:
>>> pipe.ls_process(method='nv', exact_match=False)
Return results instead of printing:
>>> df_process = pipe.ls_process(print_result=False, return_result=True)
- rm_process(process_id=None, process_label=None, input_data_level=None, output_data_level=None, application_sequence=None, method=None, exact_match=True)[source]#
Remove process items based on filtering conditions. If a filter criterion is not provided, the corresponding filter is not applied.
- Parameters:
- process_id
str, optional Process ID. The default is None.
- process_label
str, optional Custom process label. The default is None.
- input_data_level
str or int, optional Input data level of the process. See add_process for available options. The default is None.
- output_data_level
str or int, optional Output data level of the process. See add_process for available options. The default is None.
- application_sequence
int or tuple of int, optional Exact sequence number or a sequence number range within a data level. Ranges must be specified as a tuple. The default is None.
- method
str or callable or object, optional Method function, method name, or method object. The default is None.
- exact_match
bool, optional If False, processes whose property values partially match the specified value are removed. The default is True.
- Return type:
See also
Examples
For a prepared SpecExp instance exp:
>>> pipe = SpecPipe(exp)
>>> from swectral.functions import snv
>>> pipe.add_process(2, 2, 0, snv)
Remove all added processes:
>>> pipe.rm_process()
Remove processes by input data level:
>>> pipe.rm_process(input_data_level=2)
Remove processes by output data level:
>>> pipe.rm_process(output_data_level=2)
Remove processes by method:
>>> pipe.rm_process(method='snv')
- build_pipeline(step_methods)[source]#
Build pipelines by given structure and methods of each step.
This method constructs one or more processing pipelines directly from an explicit structural description. Each pipeline step is defined by its input/output data levels with one or more alternative callable(s) or objects responsible for processing at that step.
- Parameters:
- step_methods
listof((strorint,strorint),callable()orobjectorlistof(callable()orobject)ordictofstrto(callable()orobject),NoneordictofstrtoAny) A list describing the pipeline structure and the processing logic for each step. Each element of the list has the form:
((input_data_level, output_data_level), methods, params)
where:
input_data_levelint or strInput data level in number or name. See
add_processfor details.output_data_levelint or strOutput data level in number or name. See
add_processfor details.methodscallable or object or list or dictA single callable or object defining one processing method. A list of callables or objects representing alternative methods for the step. A dictionary mapping method names to callables or objects, allowing multiple named methods.
paramsdict, optionalOptional dictionary of additional parameters applied to the methods at the step.
- step_methods
- Return type:
See also
Examples
For an initialized SpecPipe instance
pipe:>>> from swectral import roi_mean >>> from swectral.functions import snv, minmax, aucnorm >>> from sklearn.ensemble import RandomForestRegressor >>> from sklearn.neighbors import KNeighborsRegressor >>> pipe.build_pipeline( ... [ ... ((2, 2), [snv, minmax, aucnorm]), ... ((5, 7), roi_mean), ... ((7, 8), {'RF': RandomForestRegressor(n_estimators=6), 'KNN': KNeighborsRegressor(n_neighbors=3)}, {'validation_method': '5-fold'}) ... ] ... )
- ls_process_chains(stage=None, print_label=True, return_label=False)[source]#
List process chains. Returns the default full-factorial process chains.
Returns a dataframe where each row represents a processing chain with process IDs. For custom chains, use
ls_custom_chains.- Parameters:
- stage
strorNone,optional Processing stage, choose between:
- ``None``: list entire processing chains. - ``preprocessing``: list unique preprocessing stage of the processing chains. - ``assembly``: list unique assembly stage of the processing chains. - ``model`` or ``modeling``: list unique modeling stage of the processing chains.
Default is
None.- print_labelbool,
optional If True, prints chains using chain label. Default is True.
- return_labelbool,
optional If True, returns an additional dataframe of process labels. Default is False.
- stage
- Returns:
pandas.DataFrameortupleofpandas.DataFrameorNoneIf
return_label=False, returns apandas.DataFrameof process chains in process IDs.If
return_label=True, returns a tuple of 2pandas.DataFrameof process chains in IDs and labels.If no process is added to this SpecPipe instance, returns None.
- Return type:
See also
Notes
This method is also available as
process_chains_to_df.Examples
For prepared
SpecPipeinstancepipe:>>> pipe.ls_process_chains()
Or equivalent:
>>> pipe.process_chains_to_df()
Return label display in addition to process ID display:
>>> pipe.ls_process_chains(return_label=True)
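As a rough sketch of what the stage options do (the process IDs and the split between preprocessing and modeling steps here are assumptions for illustration), listing a single stage keeps only the unique step combinations belonging to that stage:

```python
# Placeholder full-factorial chains: two preprocessing steps + one model step.
chains = [
    ('P1', 'P4', 'M1'),
    ('P1', 'P4', 'M2'),
    ('P2', 'P4', 'M1'),
]

# stage='preprocessing': unique preprocessing-step combinations.
preprocessing = []
for chain in chains:
    head = chain[:2]  # assumed preprocessing portion of the chain
    if head not in preprocessing:
        preprocessing.append(head)
```

Here three full chains reduce to two unique preprocessing-stage combinations.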
- process_chains_to_df(stage=None, print_label=True, return_label=False)#
List process chains. Returns the default full-factorial process chains.
Returns a dataframe where each row represents a processing chain with process IDs. For custom chains, use
ls_custom_chains.- Parameters:
- stage
strorNone,optional Processing stage, choose between:
- ``None``: list entire processing chains. - ``preprocessing``: list unique preprocessing stage of the processing chains. - ``assembly``: list unique assembly stage of the processing chains. - ``model`` or ``modeling``: list unique modeling stage of the processing chains.
Default is
None.- print_labelbool,
optional If True, prints chains using chain label. Default is True.
- return_labelbool,
optional If True, returns an additional dataframe of process labels. Default is False.
- stage
- Returns:
pandas.DataFrameortupleofpandas.DataFrameorNoneIf
return_label=False, returns apandas.DataFrameof process chains in process IDs.If
return_label=True, returns a tuple of 2pandas.DataFrameof process chains in IDs and labels.If no process is added to this SpecPipe instance, returns None.
- Return type:
See also
Notes
This method is also available as
ls_process_chains.Examples
For prepared
SpecPipeinstancepipe:>>> pipe.ls_process_chains()
Or equivalent:
>>> pipe.process_chains_to_df()
Return label display in addition to process ID display:
>>> pipe.ls_process_chains(return_label=True)
- custom_chains_from_df(process_chain_dataframe)[source]#
Customize processing chains and update chains using a chain dataframe.
Once custom chains are created, SpecPipe will prioritize their execution, bypassing the original full-factorial chains.
- Parameters:
- process_chain_dataframepandas.DataFrame-like
A process chain dataframe.
Must be a subset of the original full-factorial chains, and each chain must be complete. Columns must be [‘Step_1’, ‘Step_2’, …] and the number of columns must match that of
process_chains. And all values must be valid process IDs of SpecPipe.It is recommended to modify the dataframe obtained from
ls_process_chainsorprocess_chains_to_dfto construct a customized process chain dataframe. Users must retrieve the complete processing chain DataFrame by leavingstageat its default value or explicitly setting it toNone.
- Return type:
See also
Examples
For prepared
SpecPipeinstancepipe:>>> df_chain = pipe.process_chains_to_df()
After modification, load the modified dataframe:
>>> pipe.custom_chains_from_df(df_chain_modified)
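A minimal sketch of the modification step, using pandas only (the Step columns and process IDs below are placeholders): start from the full-factorial table and keep a subset of rows, leaving columns and process IDs unchanged so the result stays a complete-chain subset of the original:

```python
import pandas as pd

# Placeholder for the table returned by pipe.process_chains_to_df().
df_chain = pd.DataFrame({
    'Step_1': ['P1', 'P2', 'P3'],
    'Step_2': ['P4', 'P4', 'P4'],
    'Step_3': ['M1', 'M1', 'M1'],
})

# Keep only chains whose first step is P1 or P3.
df_chain_modified = df_chain[df_chain['Step_1'].isin(['P1', 'P3'])]
```

The modified frame would then be passed to pipe.custom_chains_from_df(df_chain_modified).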
- ls_custom_chains(stage=None, print_label=True, return_label=False)[source]#
List customized process chains.
Returns a dataframe where each row represents a processing chain with process IDs.
- Parameters:
- stage
strorNone,optional Processing stage, choose between:
- ``None``: list entire processing chains. - ``preprocessing``: list unique preprocessing stage of the processing chains. - ``assembly``: list unique assembly stage of the processing chains. - ``model`` or ``modeling``: list unique modeling stage of the processing chains.
Default is
None.- print_labelbool,
optional If True, prints chains using chain label. Default is True.
- return_labelbool,
optional If True, returns an additional dataframe of process labels. Default is False.
- stage
- Returns:
pandas.DataFrameortupleofpandas.DataFrameorNoneIf
return_label=False, returns apandas.DataFrameof process chains in process IDs.If
return_label=True, returns a tuple of 2pandas.DataFrameof process chains in IDs and labels.If no custom chain is specified in this SpecPipe instance, returns None.
- Return type:
See also
Notes
This method is also available as
custom_chains_to_df.Examples
For prepared
SpecPipeinstancepipe:>>> df_chain = pipe.ls_custom_chains()
- custom_chains_to_df(stage=None, print_label=True, return_label=False)#
List customized process chains.
Returns a dataframe where each row represents a processing chain with process IDs.
- Parameters:
- stage
strorNone,optional Processing stage, choose between:
- ``None``: list entire processing chains. - ``preprocessing``: list unique preprocessing stage of the processing chains. - ``assembly``: list unique assembly stage of the processing chains. - ``model`` or ``modeling``: list unique modeling stage of the processing chains.
Default is
None.- print_labelbool,
optional If True, prints chains using chain label. Default is True.
- return_labelbool,
optional If True, returns an additional dataframe of process labels. Default is False.
- stage
- Returns:
pandas.DataFrameortupleofpandas.DataFrameorNoneIf
return_label=False, returns apandas.DataFrameof process chains in process IDs.If
return_label=True, returns a tuple of 2pandas.DataFrameof process chains in IDs and labels.If no custom chain is specified in this SpecPipe instance, returns None.
- Return type:
See also
Notes
This method is also available as
ls_custom_chains.Examples
For prepared
SpecPipeinstancepipe:>>> df_chain = pipe.ls_custom_chains()
- ls_chains(stage=None, print_label=True, return_label=False)[source]#
List process chains for the pipeline execution.
Returns custom process chains if they are specified; otherwise, returns the default full-factorial process chains.
Returns a dataframe where each row represents a processing chain with process IDs.
- Parameters:
- stage
strorNone,optional Processing stage, choose between:
- ``None``: list entire processing chains. - ``preprocessing``: list unique preprocessing stage of the processing chains. - ``assembly``: list unique assembly stage of the processing chains. - ``model`` or ``modeling``: list unique modeling stage of the processing chains.
Default is
None.- print_labelbool,
optional If True, prints chains using chain label. Default is True.
- return_labelbool,
optional If True, returns an additional dataframe of process labels. Default is False.
- stage
- Returns:
pandas.DataFrameortupleofpandas.DataFrameorNoneIf
return_label=False, returns apandas.DataFrameof process chains in process IDs.If
return_label=True, returns a tuple of 2pandas.DataFrameof process chains in IDs and labels.If no process chain exists in this SpecPipe instance, returns None.
- Return type:
See also
Examples
For created
SpecPipeinstancepipe:>>> df_chain = pipe.ls_chains()
- save_pipe_config(copy=False, save_spec_exp_config=True)[source]#
Save the current pipeline configuration files to the root of the report directory.
- Parameters:
- Return type:
Notes
This method is also available as
save_config.Examples
For a created SpecPipe instance
pipe:>>> pipe.save_pipe_config()
Or equivalently:
>>> pipe.save_config()
Save a backup copy as well:
>>> pipe.save_pipe_config(copy=True)
- save_config(copy=False, save_spec_exp_config=True)#
Save the current pipeline configuration files to the root of the report directory.
- Parameters:
- Return type:
Notes
This method is also available as
save_pipe_config.Examples
For a created SpecPipe instance
pipe:>>> pipe.save_pipe_config()
Or equivalently:
>>> pipe.save_config()
Save a backup copy as well:
>>> pipe.save_pipe_config(copy=True)
- load_pipe_config(config_file_path='')[source]#
Load SpecPipe configuration from a dill file.
- Parameters:
- config_file_path
str,optional Path to the SpecPipe configuration dill file.
Can be a file path or the file name in the report directory of this SpecPipe instance.
If not provided or empty, the path will be:
(SpecPipe.spec_exp.report_directory)/SpecPipe_configuration/SpecPipe_pipeline_configuration_created_at_(SpecExp.create_time).dill.Default is empty string.
- config_file_path
- Return type:
See also
Notes
This method is also available as
load_config.Examples
For a created SpecPipe instance
pipe:>>> pipe.save_pipe_config()
Load from the default configuration path:
>>> pipe.load_pipe_config()
Or equivalently:
>>> pipe.load_config()
Load from a custom configuration file path:
>>> pipe.load_pipe_config("/pipe_config.dill")
- load_config(config_file_path='')#
Load SpecPipe configuration from a dill file.
- Parameters:
- config_file_path
str,optional Path to the SpecPipe configuration dill file.
Can be a file path or the file name in the report directory of this SpecPipe instance.
If not provided or empty, the path will be:
(SpecPipe.spec_exp.report_directory)/SpecPipe_configuration/SpecPipe_pipeline_configuration_created_at_(SpecExp.create_time).dill.Default is empty string.
- config_file_path
- Return type:
See also
Notes
This method is also available as
load_pipe_config.Examples
For a created SpecPipe instance
pipe:>>> pipe.save_pipe_config()
Load from the default configuration path:
>>> pipe.load_pipe_config()
Or equivalently:
>>> pipe.load_config()
Load from a custom configuration file path:
>>> pipe.load_pipe_config("/pipe_config.dill")
- test_run(test_modeling=True, return_result=False, model_test_coverage=1.0, assembly_test_coverage=1.0, dump_result=True, dump_backup=False, save_preprocessed_images=False, num_type=<class 'numpy.float32'>)[source]#
Run the pipeline of all processing chains using simplified test data. This method is executed automatically prior to each formal run.
- Parameters:
- test_modelingbool,
optional Whether added models are tested. If False, only the first chain is tested. The default is True.
- return_resultbool,
optional Whether results of the processes are returned. If True, results of all tested steps are returned in a list. The default is False.
- model_test_coverage
float,optional Fraction of modeling pipelines to test. Set to a value < 1.0 to reduce test runtime by randomly sampling preprocessing results without replacement.
If 1.0, all pipelines will be tested.
If < 1.0, only the specified fraction of pipelines will be tested.
Default is 1.0.
- assembly_test_coverage
float,optional Fraction of assembly pipelines to test. Ignored if no assembly process is configured. Set to a value < 1.0 to reduce test runtime by randomly sampling preprocessing results without replacement.
If 1.0, all pipelines will be tested.
If < 1.0, only the specified fraction of pipelines will be tested.
Default is 1.0.
- dump_resultbool,
optional Whether test results are stored in the chains. The default is True.
- dump_backupbool,
optional Whether backup of the step results is stored. The backup file is named with the datetime of dumping. The default is False.
- num_type: str or type, optional
Numeric data type for array-like data storage, supporting numeric
numpydata types. Default isnumpy.float32.
- test_modelingbool,
- Returns:
- Return type:
Examples
For a created SpecPipe instance
pipe:>>> pipe.test_run()
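The coverage parameters can be read as sampling without replacement. A hedged sketch of that idea (sample_chains is a hypothetical helper written for illustration, not part of the API):

```python
import random

def sample_chains(chains, coverage, seed=0):
    """Hypothetical helper: test a random fraction of chains,
    sampled without replacement, keeping at least one chain."""
    if coverage >= 1.0:
        return list(chains)
    k = max(1, int(len(chains) * coverage))
    return random.Random(seed).sample(list(chains), k)
```

With a coverage of 0.5 over 10 chains, 5 distinct chains would be tested; a coverage of 1.0 tests them all.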
- preprocessing(n_processor=1, resume=False, result_directory='', num_type=<class 'numpy.float32'>, dump_backup=False, step_result=False, to_csv=True, show_progress=True, save_config=True, summary=True, geo_reference_warning=False, skip_test=False, check_space=True)[source]#
Run preprocessing steps of all processing chains on the entire dataset and output modeling-ready sample_list data to files.
- Parameters:
- result_directory
str,optional Directory for storing the preprocessing results. If not provided, the
report_directoryattribute of thisSpecPipeinstance is used.For consistency with subsequent pipeline stages, using the default location is strongly recommended.
- n_processor
int Number of processors to use in preprocessing.
Default is
1(parallel processing is not applied).Windows note: when using
n_processor > 1on Windows, all executable code in the working script must be placed within:if __name__ == '__main__':
- num_type: str or type, optional
Numeric data type for array-like data storage, supporting numeric
numpydata types. Default isnumpy.float32.- dump_backupbool,
optional Create backup files of results with timestamp. Default is False.
- step_resultbool,
optional Whether to retain intermediate results for each processing chain.
If False, intermediate results are discarded immediately after processing.
If True, intermediate results are preserved. This may require substantial additional storage during processing.
Default is False.
- resumebool
If True, computation resumes from preprocessing progress logs. Apply
resumeto avoid repeated preprocessing after interruption. Default is False.- to_csvbool
If True, final preprocessing results are also saved to CSV files in addition to
dillfiles. Default is True.- show_progressbool,
optional Show processing progress. Default is True.
- save_configbool,
optional Save
SpecPipeconfigurations. Default is True.- summarybool,
optional Whether to summarize preprocessed data and target values. Default is True.
- geo_reference_warningbool,
optional Whether to emit GeoReferenceWarning. If False, the warning is suppressed. Default is False.
- skip_testbool,
optional Whether to skip test execution completely. Test execution validates every processing chain and serves as a safeguard against runtime errors in long formal execution. Default is False.
- check_spacebool,
optional Whether to validate available disk space against the estimated output size. If True, an error is raised when the estimate exceeds the available space. If False, a warning is issued instead. Default is True.
- result_directory
- Return type:
See also
Examples
For created
SpecPipeinstancepipe:>>> pipe.preprocessing()
Pipeline-level multiprocessing:
>>> pipe.preprocessing(n_processor=10)
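The check_space behavior, combined with the class-level reserve_free_pct, can be sketched as a pure function (has_enough_space is a hypothetical helper restating the documented rule, not the library's implementation):

```python
def has_enough_space(total_bytes, free_bytes, estimated_bytes,
                     reserve_free_pct=5.0):
    """Hypothetical sketch: the estimated output must fit into the
    free space left after reserving reserve_free_pct of the disk."""
    reserve = total_bytes * reserve_free_pct / 100.0
    return (free_bytes - reserve) >= estimated_bytes
```

For example, on a 100 GB disk with 50 GB free and a 40 GB estimate, 5 GB is reserved and the check passes; with only 10 GB free it fails, which would raise an error when check_space=True and warn otherwise.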
- assembly(n_processor=1, resume=False, dump_backup=False, step_result=False, show_progress=True)[source]#
Apply assembly processes to introduce cross-sample interactions prior to modeling. This stage is skipped automatically if no assembly process has been added.
These operations directly modify the processed sample data and may alter both the composition of the sample set and the internal representation of individual samples.
- Parameters:
- n_processor
int Number of processors to use in assembly.
Default is
1(parallel processing is not applied).Windows note: when using
n_processor > 1on Windows, all executable code in the working script must be placed within:if __name__ == '__main__':
- resumebool
If True, computation resumes from assembly progress logs. Apply
resumeto avoid repeated assembly after interruption. Default is False.- dump_backupbool,
optional Create backup files of results with timestamp. Default is False.
- step_resultbool,
optional Whether to retain intermediate results for each processing chain.
If False, intermediate results are discarded immediately after processing.
If True, intermediate results are preserved. This may require substantial additional storage during processing.
Default is False.
- show_progressbool,
optional Show processing progress. Default is True.
- n_processor
- Return type:
Examples
For created
SpecPipeinstancepipe:>>> pipe.preprocessing() >>> pipe.assembly() >>> pipe.model_evaluation()
Pipeline-level multiprocessing:
>>> pipe.assembly(n_processor=10)
- model_evaluation(n_processor=1, resume=False, report_directory='', show_progress=True, save_config=True, summary=True, multitest_correction='fdr_bh', check_space=True)[source]#
Evaluate added models using processed sample data generated by all preprocessing chains. Modeling and evaluation behavior is configured when models are added to the pipeline.
The method automatically summarizes group-level statistics and computes marginal model performance metrics for alternative method options at each processing step.
- Parameters:
- n_processor
int Number of processors to use in model evaluation.
Default is
1(parallel processing is not applied).Windows note: when using
n_processor > 1on Windows, all executable code in the working script must be placed within:if __name__ == '__main__':
- resumebool
If True, computation resumes from model evaluation progress logs.
Use
resumeto avoid repeated model evaluation after interruption. Default is False.- report_directory
str,optional Directory for storing the model evaluation results. If not provided, the
report_directoryattribute of thisSpecPipeinstance is used.For consistency with the other pipeline stages, using the default location is strongly recommended.
- show_progressbool,
optional Show processing progress. Default is True.
- save_configbool,
optional Save
SpecPipeconfigurations. Default is True.- summarybool,
optional Whether to summarize overall and marginal performance. Marginal performance metrics at each processing step are compared using the Mann–Whitney U test. Default is True.
- multitest_correction: str or None
Method used for adjustment of significance test p-values. See
statsmodels.stats.multitest.multipletestsfor available options.- check_spacebool,
optional Whether to validate available disk space against the estimated output size. If True, an error is raised when the estimate exceeds the available space. If False, a warning is issued instead. Default is True.
- n_processor
- Return type:
See also
add_processadd_modelbuild_pipelinepreprocessingassemblyrungroupstats.performance_metrics_summarygroupstats.performance_marginal_statsmodelconnector.combined_model_marginal_stats
Examples
For a prepared
SpecPipeinstancepipe:>>> pipe.preprocessing() >>> pipe.model_evaluation()
Pipeline-level multiprocessing:
>>> pipe.model_evaluation(n_processor=10)
- run(result_directory='', n_processor=-1, num_type=<class 'numpy.float32'>, test_model=True, model_parallel=True, dump_backup=False, step_result=False, resume=False, resume_modeling=True, sample_data_to_csv=True, show_progress=True, save_config=True, summary=True, multitest_correction='fdr_bh', geo_reference_warning=False, model_test_coverage=1.0, assembly_test_coverage=1.0, skip_test=False, check_space=True)[source]#
Run entire pipelines of specified processes of this
SpecPipeinstance on the providedSpecExpinstance. Processes are configured using the methodsadd_processandadd_model.- Parameters:
- result_directory
str,optional Directory to save preprocessing and model evaluation reports. Default is the
report_directoryof the inputSpecExpinstance of thisSpecPipeinstance.- n_processor
int,optional Number of processors to use during pipeline execution.
Default is -1, which does not apply parallel execution on Windows and applies parallel execution using (maximum available CPUs - 1) processors on other operating systems.
Set to -2 to force (maximum available CPUs - 1) processors on Windows.
Windows note: when using
n_processor > 1orn_processor = -2on Windows, all executable code in the working script must be placed within:if __name__ == '__main__':
- num_type: str or type, optional
Numeric data type for array-like data storage, supporting numeric
numpydata types. Default isnumpy.float32.- test_modelbool,
optional Whether to test added models before formal execution.
If False, model testing is skipped. Tests use minimal sample sizes, which may cause errors for some models. Default is True.
- model_parallelbool,
optional Whether to enable pipeline-level parallelism during modeling.
Set to False when the modeling method already uses multiprocessing or GPU acceleration to avoid nested parallel execution. Default is True.
- dump_backupbool,
optional Whether to create timestamped backup files of results. Default is False.
- step_resultbool,
optional Whether to retain intermediate results for each processing chain.
If False, intermediate results are discarded immediately after processing.
If True, intermediate results are preserved. This may require substantial additional storage during processing.
Default is False.
- resumebool,
optional Whether to resume execution from the last saved preprocessing checkpoint.
This avoids redundant processing after interruptions. Default is False.
- resume_modelingbool,
optional If True, resume modeling; otherwise, rebuild and re-evaluate all models.
Effective only if
resume=True; ignored ifresume=False. Default is True.
- sample_data_to_csvbool,
optional Whether to additionally save preprocessed sample data as CSV files. Default is True.
- show_progressbool,
optional Whether to display execution progress. Default is True.
- save_configbool,
optional Whether to save SpecPipe configuration files. Default is True.
- summarybool,
optional Whether to summarize preprocessed data, performance metrics, and marginal performance metrics. Marginal performance at each step is compared using the Mann–Whitney U test. Default is True.
- multitest_correction: str or None
Method used for adjustment of significance test p-values. See
statsmodels.stats.multitest.multipletestsfor available options.- geo_reference_warningbool,
optional Whether to emit GeoReferenceWarning messages. If False, the warnings are suppressed. Default is False.
- skip_testbool,
optional Whether to skip test execution entirely. Test execution validates all processing chains and serves as a safeguard against runtime errors during long executions. Default is False.
- check_spacebool,
optional Whether to validate available disk space against the estimated output size. If True, an error is raised when the estimate exceeds the available space. If False, a warning is issued instead. Default is True.
- result_directory
- Return type:
See also
Examples
For a prepared
SpecPipeinstancepipe:>>> pipe.run()
Pipeline-level multiprocessing:
>>> pipe.run(n_processor=10)
Automatically determine CPU usage:
>>> pipe.run(n_processor=-1)
Windows multiprocessing:
>>> if __name__ == '__main__': ... pipe.run(n_processor=10) >>> if __name__ == '__main__': ... pipe.run(n_processor=-2)
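The documented n_processor defaults can be summarized in a small helper (resolve_n_processor is hypothetical, written only to restate the rules above):

```python
import os
import sys

def resolve_n_processor(n_processor, platform=None, cpu_count=None):
    """Hypothetical restatement of the documented defaults:
    -1 -> serial on Windows, (CPUs - 1) elsewhere;
    -2 -> force (CPUs - 1), including on Windows."""
    platform = platform or sys.platform
    cpus = cpu_count or os.cpu_count() or 1
    if n_processor == -1:
        return 1 if platform.startswith('win') else max(1, cpus - 1)
    if n_processor == -2:
        return max(1, cpus - 1)
    return max(1, n_processor)
```

So with 8 CPUs, n_processor=-1 resolves to 1 worker on Windows and 7 elsewhere, while n_processor=-2 forces 7 workers on any platform.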
- report_summary()[source]#
Retrieve a summary of the generated reports in the console. The summary includes the overall performance summary and the marginal performances among added processes of each pipeline step.
- Returns:
dictA dictionary of pandas DataFrames with the following contents:
For regression:
Performance summary.
Marginal R2 of the steps with multiple processes.
For classification:
Macro- and micro-average performance summary.
Marginal macro- and micro-average AUC of the steps with multiple processes.
- Return type:
Examples
For
SpecPipeinstancepipeafter running:>>> result_summary = pipe.report_summary()
- report_chains()[source]#
Retrieve the major model evaluation reports of every processing chain in the console.
- Returns:
listofdictEach dictionary contains the reports of one processing chain.
For regression pipelines, the reports include:
Processes of the chain
Validation results
Performance metrics
Residual analysis
Influence analysis (if available)
Scatter plot
Residual plot
For classification pipelines, the reports include:
Processes of the chain
Validation results
Performance metrics
Residual analysis
Influence analysis (if available)
ROC curves
- Return type:
Examples
For
SpecPipeinstancepipeafter running:>>> chain_results = pipe.report_chains()