minibatch
- fastcan.minibatch(X, y, n_features_to_select=1, batch_size=1, tol=0.01, verbose=1)
Feature selection using fastcan.FastCan with mini batches. It is suitable for selecting a very large number of features, even larger than the number of samples.
The function splits n_features_to_select into n_outputs parts and selects features for each part separately, ignoring the redundancy among outputs. In each part, the function selects features batch-by-batch. The batch size is less than or equal to batch_size. Like correlation filters, which select features one-by-one without considering the redundancy between two features, the function ignores the redundancy between two mini-batches.
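The two-level split described above can be sketched in plain Python. This is only an illustration: `split_budget` and `batch_sizes` are hypothetical helpers, and the even-split and "full batches first" rules are assumptions, since the text states only that each batch holds at most batch_size features. The library may divide the budget differently.

```python
def split_budget(n_features_to_select, n_outputs):
    """Hypothetical helper: divide the selection budget across outputs.

    Assumes an as-even-as-possible split; the real rule may differ.
    """
    base, extra = divmod(n_features_to_select, n_outputs)
    # The first `extra` outputs each receive one additional feature.
    return [base + (1 if i < extra else 0) for i in range(n_outputs)]


def batch_sizes(n_part, batch_size):
    """Hypothetical helper: cut one output's share into mini-batches.

    Every batch holds at most `batch_size` features (assumed: full
    batches first, then one smaller remainder batch).
    """
    full, rest = divmod(n_part, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


# Example: 7 features to select, 2 outputs, batch_size=3.
parts = split_budget(7, 2)
plan = [batch_sizes(p, 3) for p in parts]
print(parts)  # [4, 3]
print(plan)   # [[3, 1], [3]]
```

Under these assumptions, each inner list is one output's sequence of mini-batches. Redundancy is accounted for only within a single mini-batch, not across mini-batches or across outputs.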
- Parameters:
X (array-like of shape (n_samples, n_features)) – Feature matrix.
y (array-like of shape (n_samples, n_outputs)) – Target matrix.
n_features_to_select (int, default=1) – The absolute number of features to select.
batch_size (int, default=1) – The upper bound of the number of features in a mini-batch. It is recommended that batch_size be less than n_samples.
tol (float, default=0.01) – Tolerance for linear dependence check.
verbose (int, default=1) – The verbosity level.
- Returns:
indices – The indices of the selected features.
- Return type:
ndarray of shape (n_features_to_select,), dtype=int
Examples
>>> from fastcan import minibatch
>>> X = [[1, 1, 0], [0.01, 0, 0], [-1, 0, 1], [0, 0, 0]]
>>> y = [1, 0, -1, 0]
>>> indices = minibatch(X, y, 3, batch_size=2, verbose=0)
>>> print(f"Indices: {indices}")
Indices: [0 1 2]
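The returned indices are column positions in X, so they can be used to build the reduced feature matrix. The sketch below does not call fastcan. It copies the indices from the example output above as a literal and uses plain Python lists. With a NumPy array, indexing as `X[:, indices]` would give the same columns.

```python
X = [[1, 1, 0], [0.01, 0, 0], [-1, 0, 1], [0, 0, 0]]

# Copied from the example output above; not recomputed here.
indices = [0, 1, 2]

# Keep only the selected columns, in the order they were selected.
X_selected = [[row[j] for j in indices] for row in X]

print(len(X_selected), len(X_selected[0]))
```

In this small example all three features are selected, so X_selected has the same columns as X. With fewer selected features, the same comprehension drops the remaining columns.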