FastCan
- class fastcan.FastCan(n_features_to_select=1, *, indices_include=None, indices_exclude=None, eta=False, tol=0.01, beam_width=1, verbose=1)
Forward feature selector based on the sum of squared canonical correlation coefficients (SSC).
- Parameters:
n_features_to_select (int, default=1) – The absolute number of features to select.
indices_include (array-like of shape (n_inclusions,), default=None) – The indices of the prerequisite features.
indices_exclude (array-like of shape (n_exclusions,), default=None) – The indices of the excluded features.
eta (bool, default=False) – Whether to use the eta-cosine method.
tol (float, default=0.01) –
Tolerance for the linear dependence check.
When abs(w.T*x) > tol, the modified Gram-Schmidt process fails because the feature x is linearly dependent on the selected features, and the mask for that feature is set to True.
beam_width (int, default=1) –
The beam width for beam search. When beam_width = 1, use greedy search. When beam_width > 1, use beam search.
Added in version 0.5.0.
verbose (int, default=1) – The verbosity level.
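The effect of beam_width can be illustrated with a generic beam search over feature subsets. The following is a minimal sketch, not FastCan's implementation: `beam_select`, `toy_score`, and the score table are hypothetical names and values, and the toy scores merely stand in for SSC values. With a beam width of 1 the search keeps a single candidate per step (greedy search); with a larger width it keeps several, which can recover a subset that the greedy search misses.

```python
# Hypothetical subset scores standing in for SSC values.
TOY_SCORES = {
    frozenset({0}): 0.6,
    frozenset({1}): 0.5,
    frozenset({2}): 0.5,
    frozenset({0, 1}): 0.7,
    frozenset({0, 2}): 0.7,
    frozenset({1, 2}): 1.0,
}


def toy_score(subset):
    """Look up the score of a subset given as a tuple of feature indices."""
    return TOY_SCORES[frozenset(subset)]


def beam_select(score, n_features, n_select, beam_width=1):
    """Select n_select features, keeping the beam_width best subsets per step."""
    beams = [()]  # each beam is a tuple of indices in selection order
    for _ in range(n_select):
        candidates = {}
        for subset in beams:
            for j in range(n_features):
                if j in subset:
                    continue
                extended = subset + (j,)
                # Deduplicate subsets that differ only in selection order.
                candidates.setdefault(frozenset(extended), extended)
        ranked = sorted(candidates.values(), key=score, reverse=True)
        beams = ranked[:beam_width]
    return beams[0]


greedy = beam_select(toy_score, n_features=3, n_select=2, beam_width=1)
beam = beam_select(toy_score, n_features=3, n_select=2, beam_width=2)
print(greedy, toy_score(greedy))  # (0, 1) 0.7
print(beam, toy_score(beam))      # (1, 2) 1.0
```

In this toy table, feature 0 is the best single feature but the pair {1, 2} is the best pair, so only the wider beam finds it.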
- n_features_in_
Number of features seen during fit. Only defined if the underlying estimator exposes such an attribute when fit.
- Type:
int
- feature_names_in_
Names of features seen during fit. Defined only when X has feature names that are all strings.
- Type:
ndarray of shape (n_features_in_,)
- indices_
The indices of the selected features, listed in the order in which the features were selected.
- Type:
ndarray of shape (n_features_to_select,), dtype=int
- support_
The mask of selected features.
- Type:
ndarray of shape (n_features,), dtype=bool
- scores_
The h-correlation/eta-cosine of the selected features, listed in the order in which the features were selected.
- Type:
ndarray of shape (n_features_to_select,), dtype=float
- X_transformed_
Transformed feature matrix. When the h-correlation method is used, n_samples_ = n_samples. When the eta-cosine method is used, n_samples_ = n_features+n_outputs.
- Type:
ndarray of shape (n_samples_, n_features), dtype=float, order='F'
- y_transformed_
Transformed target matrix. When the h-correlation method is used, n_samples_ = n_samples. When the eta-cosine method is used, n_samples_ = n_features+n_outputs.
- Type:
ndarray of shape (n_samples_, n_outputs), dtype=float, order='F'
- indices_include_
The indices of the prerequisite features.
- Type:
ndarray of shape (n_inclusions,), dtype=int
- indices_exclude_
The indices of the excluded features.
- Type:
array-like of shape (n_exclusions,), dtype=int
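The relationship between the ordered attributes (indices_, scores_) and the boolean mask (support_) can be sketched in plain Python. This is an illustration only: `support_from_indices` is a hypothetical helper, and the index and score values are made up rather than produced by FastCan.

```python
def support_from_indices(indices, n_features):
    """Build a boolean mask that is True at every selected index."""
    mask = [False] * n_features
    for i in indices:
        mask[i] = True
    return mask


# Made-up values: feature 2 was selected first, then feature 0.
indices = [2, 0]
scores = [0.9, 0.3]

support = support_from_indices(indices, n_features=4)
ranking = list(zip(indices, scores))  # pairs each index with its score

print(support)  # [True, False, True, False]
print(ranking)  # [(2, 0.9), (0, 0.3)]
```

The mask records which features were selected, while the ordered pair of lists also records when each feature was selected.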
References
- Zhang, S., & Lang, Z. Q. (2022).
Orthogonal least squares based fast feature selection for linear classification. Pattern Recognition, 123, 108419.
- Zhang, S., Wang, T., Worden, K., Sun, L., & Cross, E. J. (2025).
Canonical-correlation-based fast feature selection for structural health monitoring. Mechanical Systems and Signal Processing, 223, 111895.
Examples
>>> from fastcan import FastCan
>>> X = [[1, 0], [0, 1]]
>>> y = [1, 0]
>>> FastCan(verbose=0).fit(X, y).get_support()
array([ True, False])
- fit(X, y)
Prepare data for h-correlation or eta-cosine methods and select features.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Feature matrix.
y (array-like of shape (n_samples, n_outputs)) – Target matrix.
- Returns:
self – Returns the instance itself.
- Return type:
object
- fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
**fit_params (dict) – Additional fit parameters. Pass only if the estimator accepts additional params in its fit method.
- Returns:
X_new – Transformed array.
- Return type:
ndarray of shape (n_samples, n_features_new)
- get_feature_names_out(input_features=None)
Mask feature names according to selected features.
- Parameters:
input_features (array-like of str or None, default=None) –
Input features.
If input_features is None, then feature_names_in_ is used as feature names in. If feature_names_in_ is not defined, then the following input feature names are generated: ["x0", "x1", ..., "x(n_features_in_ - 1)"].
If input_features is an array-like, then input_features must match feature_names_in_ if feature_names_in_ is defined.
- Returns:
feature_names_out – Transformed feature names.
- Return type:
ndarray of str objects
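The naming and masking rule described above can be sketched without the library. `masked_feature_names` is a hypothetical helper written for illustration; it takes the mask as a plain list and is not the code that FastCan or scikit-learn runs.

```python
def masked_feature_names(support, input_features=None):
    """Return the names of the features whose mask entry is True."""
    if input_features is None:
        # Generate the default names x0, x1, ..., x(n - 1).
        input_features = [f"x{i}" for i in range(len(support))]
    if len(input_features) != len(support):
        raise ValueError("input_features must have one name per feature")
    return [name for name, keep in zip(input_features, support) if keep]


print(masked_feature_names([True, False, True]))
# ['x0', 'x2']
print(masked_feature_names([True, False, True], ["age", "height", "mass"]))
# ['age', 'mass']
```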
- get_metadata_routing()
Get metadata routing of this object.
See the User Guide for how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- get_support(indices=False)
Get a mask, or integer index, of the features selected.
- Parameters:
indices (bool, default=False) – If True, the return value will be an array of integers, rather than a boolean mask.
- Returns:
support – An index that selects the retained features from a feature vector. If indices is False, this is a boolean array of shape [# input features], in which an element is True iff its corresponding feature is selected for retention. If indices is True, this is an integer array of shape [# output features] whose values are indices into the input feature vector.
- Return type:
array
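The two return forms carry the same information, as this sketch shows. It assumes that the integer form lists the positions of the True entries in ascending order; `support_indices` is a hypothetical helper, not a library function.

```python
def support_indices(mask):
    """Convert a boolean feature mask into ascending integer indices."""
    return [i for i, keep in enumerate(mask) if keep]


mask = [True, False, True, False]  # one entry per input feature
idx = support_indices(mask)        # one entry per retained feature

print(idx)  # [0, 2]
```

Under this assumption the integer form cannot express the order in which features were selected, so that order has to be read from indices_ instead.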
- inverse_transform(X)
Reverse the transformation operation.
- Parameters:
X (array of shape [n_samples, n_selected_features]) – The input samples.
- Returns:
X_original – X with columns of zeros inserted where features would have been removed by transform().
- Return type:
array of shape [n_samples, n_original_features]
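The zero-filling behaviour can be sketched on nested lists. `inverse_transform_sketch` is a hypothetical stand-in for illustration, with the mask passed explicitly instead of being read from a fitted estimator.

```python
def inverse_transform_sketch(X_reduced, support):
    """Re-expand reduced rows, writing 0 where a feature was dropped."""
    X_original = []
    for row in X_reduced:
        values = iter(row)
        # Take the next kept value for True entries, insert 0 for False ones.
        X_original.append([next(values) if keep else 0 for keep in support])
    return X_original


print(inverse_transform_sketch([[5, 7], [6, 8]], [True, False, True]))
# [[5, 0, 7], [6, 0, 8]]
```

The values of the dropped columns are not recovered; only the original column layout is restored.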
- set_output(*, transform=None)
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
transform ({"default", "pandas", "polars"}, default=None) –
Configure output of transform and fit_transform.
"default": Default output format of a transformer
"pandas": DataFrame output
"polars": Polars output
None: Transform configuration is unchanged
Added in version 1.4: "polars" option was added.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
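The <component>__<parameter> naming convention can be illustrated by splitting each key at its first double underscore. This is a sketch of the convention only, not the library's code: `split_nested_params` and the component name "selector" are hypothetical.

```python
from collections import defaultdict


def split_nested_params(params):
    """Separate an estimator's own parameters from those of its components."""
    own = {}
    nested = defaultdict(dict)
    for key, value in params.items():
        name, delim, sub_key = key.partition("__")
        if delim:
            # "selector__tol" addresses parameter "tol" of component "selector".
            nested[name][sub_key] = value
        else:
            own[name] = value
    return own, dict(nested)


own, nested = split_nested_params(
    {"verbose": 0, "selector__tol": 0.1, "selector__inner__eta": True}
)
print(own)     # {'verbose': 0}
print(nested)  # {'selector': {'tol': 0.1, 'inner__eta': True}}
```

Only the first separator is consumed at each level, so a deeper key such as inner__eta is passed on to the component to resolve in turn.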
- transform(X)
Reduce X to the selected features.
- Parameters:
X (array of shape [n_samples, n_features]) – The input samples.
- Returns:
X_r – The input samples with only the selected features.
- Return type:
array of shape [n_samples, n_selected_features]
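Reducing X to the selected features amounts to keeping the columns whose mask entry is True. The sketch below reuses X and the mask from the Examples section; `transform_sketch` is a hypothetical helper that operates on nested lists rather than arrays.

```python
def transform_sketch(X, support):
    """Keep only the columns of X whose mask entry is True."""
    return [[value for value, keep in zip(row, support) if keep] for row in X]


X = [[1, 0], [0, 1]]
support = [True, False]  # mask shown in the Examples section

print(transform_sketch(X, support))  # [[1], [0]]
```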