PLSR
- class ims.plsr.PLSR(dataset, n_components=2, **kwargs)
Bases: object
Applies a scikit-learn PLSRegression to GC-IMS data and provides prebuilt plots as well as feature selection via variable importance in projection (VIP) scores.
See the scikit-learn documentation for more details: https://scikit-learn.org/stable/modules/generated/sklearn.cross_decomposition.PLSRegression.html
- Parameters
dataset (ims.Dataset) – Needed for the retention and drift time coordinates in the plots.
n_components (int, optional) – Number of components to keep, by default 2.
kwargs (optional) – Additional keyword arguments are passed on to the scikit-learn PLSRegression.
- x_scores
X scores.
- Type
numpy.ndarray of shape (n_samples, n_components)
- y_scores
y scores.
- Type
numpy.ndarray of shape (n_samples, n_components)
- x_weights
The left singular vectors of the cross-covariance matrices of each iteration.
- Type
numpy.ndarray of shape (n_features, n_components)
- y_weights
The right singular vectors of the cross-covariance matrices of each iteration.
- Type
numpy.ndarray of shape (n_targets, n_components)
- x_loadings
The loadings of X. When scaling was applied on the dataset, corrects the loadings using the weights.
- Type
numpy.ndarray of shape (n_features, n_components)
- y_loadings
The loadings of y.
- Type
numpy.ndarray of shape (n_targets, n_components)
- coefficients
The coefficients of the linear model.
- Type
numpy.ndarray of shape (n_features, n_targets)
- y_pred_train
Stores the predicted values from the training data for the plot method.
- Type
numpy.ndarray
Example
>>> import ims
>>> import pandas as pd
>>> ds = ims.Dataset.read_mea("IMS_data")
>>> responses = pd.read_csv("responses.csv")
>>> ds.labels = responses
>>> X_train, X_test, y_train, y_test = ds.train_test_split()
>>> model = ims.PLSR(ds, n_components=5)
>>> model.fit(X_train, y_train)
>>> model.predict(X_test, y_test)
>>> model.plot()
- fit(X_train, y_train)
Fits the model with training data.
- Parameters
X_train (numpy.ndarray of shape (n_samples, n_features)) – Training vectors with features.
y_train (numpy.ndarray of shape (n_samples, n_targets)) – Target vectors with response variables.
- Returns
Fitted model.
- Return type
self
- plot(annotate=False)
Plots predicted versus actual values and shows the regression line. Predicting with test data first is recommended.
- Parameters
annotate (bool, optional) – If True, annotates the plot with sample names, by default False.
- Return type
matplotlib.pyplot.axes
- plot_coefficients(width=6, height=6)
Plots PLS coefficients as an image with retention and drift time axes.
- Parameters
width (int or float, optional) – Width of the plot in inches, by default 6.
height (int or float, optional) – Height of the plot in inches, by default 6.
- Return type
matplotlib.pyplot.axes
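The coefficients hold one value per feature, so showing them as an image implies folding them back onto the retention time by drift time grid. A minimal sketch of that step, assuming each spectrum was flattened in row-major order with retention time along the rows; the grid size and variable names are hypothetical and not taken from the ims source.

```python
import numpy as np

# Hypothetical grid: 4 retention time points x 5 drift time points.
n_ret, n_drift = 4, 5

# Stand-in for a coefficient array of shape (n_features, n_targets).
coefficients = np.arange(n_ret * n_drift, dtype=float).reshape(-1, 1)

# Take the first target and fold it back into a 2D image.
coef_image = coefficients[:, 0].reshape(n_ret, n_drift)
print(coef_image.shape)
```

The resulting array can be handed to any image plotting routine, for example matplotlib's imshow.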
- plot_loadings(component=1, color_range=0.01, width=6, height=6)
Plots PLS x loadings as an image with retention and drift time coordinates.
- Parameters
component (int, optional) – Component to plot, by default 1.
color_range (float, optional) – Minimum and maximum to adjust to different scaling methods, by default 0.01.
width (int or float, optional) – Width of the plot in inches, by default 6.
height (int or float, optional) – Height of the plot in inches, by default 6.
- Return type
matplotlib.pyplot.axes
- plot_selectivity_ratio(threshold=None, width=6, height=6)
Plots the selectivity ratio as an image with retention and drift time axes.
- Parameters
threshold (int, optional) – Only plots selectivity ratio values above threshold if set. Values below are displayed as 0, by default None.
width (int or float, optional) – Width of the plot in inches, by default 6.
height (int or float, optional) – Height of the plot in inches, by default 6.
- Return type
matplotlib.pyplot.axes
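The method name refers to the selectivity ratio, which in the target projection literature is the ratio of explained to residual variance per variable. A sketch of that common definition with NumPy follows; it is not confirmed to match the ims implementation, the function name is hypothetical, and an ordinary least-squares fit stands in for the PLS coefficients.

```python
import numpy as np

def selectivity_ratio(X, b):
    """Selectivity ratio per feature for regression vector b (n_features,)."""
    Xc = X - X.mean(axis=0)               # column-centre the data
    w_tp = b / np.linalg.norm(b)          # target projection weights
    t_tp = Xc @ w_tp                      # target projection scores
    p_tp = Xc.T @ t_tp / (t_tp @ t_tp)    # target projection loadings
    X_tp = np.outer(t_tp, p_tp)           # part of X explained by the target component
    residual = Xc - X_tp
    # Explained over residual sum of squares, per feature.
    return np.sum(X_tp ** 2, axis=0) / np.sum(residual ** 2, axis=0)

# Synthetic data: only feature 0 drives the response.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 6))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=40)

# Assumption: least-squares coefficients used in place of PLS coefficients.
b, *_ = np.linalg.lstsq(X - X.mean(axis=0), y - y.mean(), rcond=None)
sr = selectivity_ratio(X, b)
print(np.argmax(sr))
```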
- plot_vip_scores(threshold=None, width=6, height=6)
Plots VIP scores as an image with retention and drift time axes.
- Parameters
threshold (int, optional) – Only plots VIP scores above threshold if set. Values below are displayed as 0, by default None.
width (int or float, optional) – Width of the plot in inches, by default 6.
height (int or float, optional) – Height of the plot in inches, by default 6.
- Return type
matplotlib.pyplot.axes
- Raises
ValueError – If VIP scores have not been calculated beforehand.
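For orientation, VIP scores are commonly computed from the x scores, x weights and y loadings of a fitted PLS model. The sketch below follows that common definition and applies the thresholding described above; the function name and the random input arrays are hypothetical, and the ims implementation may differ.

```python
import numpy as np

def vip_scores(x_scores, x_weights, y_loadings):
    """VIP per feature.

    x_scores:   (n_samples, n_components)
    x_weights:  (n_features, n_components)
    y_loadings: (n_targets, n_components)
    """
    n_features = x_weights.shape[0]
    # Sum of squares of y explained by each component.
    ss = np.sum(x_scores ** 2, axis=0) * np.sum(y_loadings ** 2, axis=0)
    # Squared weights, with every component column scaled to unit length.
    w_norm_sq = (x_weights / np.linalg.norm(x_weights, axis=0)) ** 2
    return np.sqrt(n_features * (w_norm_sq @ ss) / ss.sum())

# Random stand-ins for the attributes of a fitted model.
rng = np.random.default_rng(0)
T = rng.normal(size=(20, 3))
W = rng.normal(size=(50, 3))
Q = rng.normal(size=(1, 3))

vip = vip_scores(T, W, Q)

# Thresholding as described for the plot: values not above the threshold become 0.
threshold = 1.0
vip_masked = np.where(vip > threshold, vip, 0.0)
```

With this definition the mean of the squared VIP scores equals 1, which is why 1 is a frequently used threshold.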
- predict(X_test, y_test=None)
Predicts responses for features of the test data.
- Parameters
X_test (numpy.ndarray of shape (n_samples, n_features)) – Features of test data.
y_test (numpy.ndarray of shape (n_samples, n_targets), optional) – True responses for the test data. If set, allows automatic plotting of validation data, by default None.
- Returns
Predicted responses for test data.
- Return type
numpy.ndarray of shape (n_samples, n_targets)
- score(X_test, y_test, sample_weight=None)
Calculates the R^2 score for predicted data.
- Parameters
X_test (numpy.ndarray of shape (n_samples, n_features)) – Feature vectors of the test data.
y_test (numpy.ndarray of shape (n_samples, n_targets)) – True regression responses.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights, by default None.
- Returns
score – R^2 score.
- Return type
float
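As a reminder of what the returned value means, R^2 is one minus the ratio of the residual sum of squares to the total sum of squares. A minimal single-target sketch without sample weights; the helper name is hypothetical and the method itself presumably defers to scikit-learn.

```python
import numpy as np

def r2_single_target(y_true, y_pred):
    """Coefficient of determination for one response variable."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0])
perfect = r2_single_target(y_true, y_true)                       # exact predictions
baseline = r2_single_target(y_true, np.full(4, y_true.mean()))   # always predict the mean
print(perfect, baseline)
```

A model that always predicts the mean scores 0, a perfect model scores 1, and a model worse than the mean yields negative values.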