SIMCA¶

class ims.simca.MultiClassSIMCA(models=[])[source]¶

Bases: object

Builds a soft independent modelling of class analogies (SIMCA) model out of OneClassSIMCA models, one for each class.

Parameters:: models (list) – OneClassSIMCA model per class.

nclasses¶

Number of OneClassSIMCA models in models.

Type:: int

classnames¶

Names of targets in OneClassSIMCA models.

Type:: list

res_df¶

Binary representation of classification. Calculated in the predict method.

Type:: pd.DataFrame of shape (n_targets, n_samples)

Raises:: ValueError – If an instance in models was not fitted as a OneClassSIMCA model.

Example

>>> from sklearn.datasets import load_iris
>>> from sklearn.model_selection import train_test_split
>>> from ims.simca import MultiClassSIMCA
>>>
>>> # load example data and split it into training and test sets
>>> X, y = load_iris(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>>
>>> # instantiate a model and fit one OneClassSIMCA model per class
>>> model = MultiClassSIMCA()
>>> model.fit(X_train, y_train, n_components=2)
>>>
>>> # predict class memberships of the test data
>>> res_df = model.predict(X_test, y_test)
>>> print(res_df)
>>>
>>> # visualize results
>>> model.plot()

fit(X_train, y_train, n_components=2, p=0.95)[source]¶

Fit one OneClassSIMCA model per class with training data.

Parameters:

X_train (numpy.ndarray of shape (n_samples, n_features)) – Training data as feature matrix.
y_train (numpy.ndarray of shape (n_samples,)) – True class labels for training data.
n_components (int, optional) – Number of PCA components, by default 2.
p (float, optional) – False discovery rate, by default 0.95.
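As a rough illustration of what "one model per class" means, the sketch below groups a feature matrix by class label using NumPy alone. The helper name `split_by_class` and the toy arrays are hypothetical and are not part of `ims`. The grouping is an assumption about how per-class training data could be prepared, not a description of the library's internals.

```python
import numpy as np


def split_by_class(X_train, y_train):
    """Group the rows of X_train by their class label (hypothetical helper)."""
    X_train = np.asarray(X_train)
    y_train = np.asarray(y_train)
    # One entry per unique label; each value holds only that class's samples.
    return {
        label.item(): X_train[y_train == label]
        for label in np.unique(y_train)
    }


# Toy data: 6 samples, 2 features, 3 classes.
X_demo = np.arange(12).reshape(6, 2)
y_demo = np.array([0, 1, 0, 2, 1, 0])
groups = split_by_class(X_demo, y_demo)
for label, X_class in groups.items():
    print(label, X_class.shape)
```

Each per-class matrix could then serve as `X_target` for a separate one-class model, with the dictionary key as its `target`.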

plot()[source]¶

Visualizes the classification of test data.

Return type:: matplotlib.pyplot.axes

predict(X_test, y_test=None)[source]¶

Applies fitted SIMCA models to test data and makes a prediction about the class memberships.

Parameters:

X_test (numpy.ndarray of shape (n_samples, n_features)) – Test data as feature matrix.
y_test (numpy.ndarray of shape (n_samples,), optional) – True class labels for test data, by default None.

Return type:

pd.DataFrame of shape (n_targets, n_samples)
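To show how a binary result table of shape (n_targets, n_samples) can be read, the sketch below builds a small hypothetical DataFrame by hand. The class and sample names are invented, and the interpretation of 1 as "accepted by that class model" is an assumption. Because SIMCA is a soft classifier, a sample may be accepted by several class models or by none.

```python
import pandas as pd

# Hypothetical result: rows are target classes, columns are test samples.
res_df = pd.DataFrame(
    [[1, 0, 0, 1],
     [0, 1, 0, 1],
     [0, 0, 0, 0]],
    index=["setosa", "versicolor", "virginica"],
    columns=["s0", "s1", "s2", "s3"],
)

# Number of class models that accept each sample.
n_accepting = res_df.sum(axis=0)

# Classes assigned to each sample (possibly none, possibly several).
assigned = {
    sample: list(res_df.index[res_df[sample] == 1])
    for sample in res_df.columns
}
print(n_accepting.to_dict())
print(assigned)
```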

class ims.simca.OneClassSIMCA(n_components=2, q=0.95)[source]¶

Bases: object

Implements soft independent modelling of class analogies (SIMCA) for one target class.

Parameters:

n_components (int, optional) – Number of PCA components, by default 2.
q (float, optional) – q-value: false discovery rate, by default 0.95.

n_components¶

Parameter set during initialization.

Type:: int

q¶

Parameter set during initialization.

Type:: float

pca¶

The underlying PCA model.

Type:: sklearn.decomposition.PCA

target¶

Name of the target class. Parameter set with ‘fit’ method.

Type:: str

Q_target¶

Q residuals of target class. Calculated in ‘fit’ method.

Type:: numpy.ndarray of shape (n_samples,)

Q_conf¶

Q confidence limit of target class. Calculated in ‘fit’ method.

Type:: float

Q_test¶

Q residuals of test samples. Calculated in ‘predict’ method.

Type:: numpy.ndarray of shape (n_samples,)

Tsq_target¶

T square values of target class. Calculated in ‘fit’ method.

Type:: numpy.ndarray of shape (n_samples,)

Tsq_conf¶

T square confidence limit of target class. Calculated in ‘fit’ method.

Type:: float

Tsq_test¶

T square of test samples. Calculated in ‘predict’ method.

Type:: numpy.ndarray of shape (n_samples,)
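The Q residuals and T square values listed above can be sketched with the textbook PCA definitions: Q is the squared reconstruction error of a sample, and T square is the sum of its squared scores scaled by the variance of each component. The function `pca_q_and_tsq` is a hypothetical NumPy illustration of those definitions. It is not taken from the `ims` source, which may compute these quantities differently.

```python
import numpy as np


def pca_q_and_tsq(X, n_components=2):
    """Return (Q, Tsq) for the rows of X under a PCA fitted on X itself."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # mean-center the data
    # Thin SVD: rows of Vt are the principal axes, S the singular values.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T  # loadings, shape (n_features, n_components)
    T = Xc @ P               # scores, shape (n_samples, n_components)
    # Q residual: squared error after reconstructing from the kept components.
    E = Xc - T @ P.T
    Q = np.sum(E ** 2, axis=1)
    # Hotelling T square: squared scores scaled by each component's variance.
    var = S[:n_components] ** 2 / (X.shape[0] - 1)
    Tsq = np.sum(T ** 2 / var, axis=1)
    return Q, Tsq


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 5))
Q, Tsq = pca_q_and_tsq(X_demo, n_components=2)
print(Q.shape, Tsq.shape)
```

Under these definitions, a sample with a large Q lies far from the PCA subspace, while a sample with a large T square lies far from the class center within that subspace.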

Example

>>> import numpy as np
>>> from sklearn.datasets import load_iris
>>> from sklearn.utils import resample
>>> from sklearn.metrics import accuracy_score
>>> from ims.simca import OneClassSIMCA
>>>
>>> # load example data and set target (class label)
>>> X, y = load_iris(return_X_y=True)
>>> target = 0
>>>
>>> # Find target data to train model and draw random test data
>>> X_target = X[np.where(y == target)]
>>> X_test, y_test = resample(X, y, n_samples=50)
>>>
>>> # instantiate a model and fit target data
>>> model = OneClassSIMCA(n_components=2, q=0.95)
>>> model.fit(X_target, target)
>>>
>>> # make prediction on test data and calculate accuracy
>>> y_pred = model.predict(X_test)
>>> y_true = y_test == target
>>> print(accuracy_score(y_true, y_pred))
>>>
>>> # visualize results
>>> model.Tsq_Q_plot(hue=y_test)

Tsq_Q_plot(hue=None, annotate=None)[source]¶

Visualizes the fitted SIMCA model as a T square Q plot. Test data is included if the ‘predict’ method was called beforehand.

Parameters:

hue (iterable, optional) – Iterable with test data class labels to color markers by class. Only labels for the test data are needed; labels for the target class are already known. Ignored if the ‘predict’ method was not called beforehand, by default None.
annotate (iterable, optional) – Iterable with sample names to annotate markers with, by default None.

Return type:

matplotlib.pyplot.axes

Raises:

ValueError – If the instance was not fitted before plotting.

fit(X_target, target)[source]¶

Fit SIMCA model with data from target class.

Parameters:

X_target (numpy.ndarray of shape (n_samples, n_features)) – Training data for target class.
target (str) – Name of the target class as identifier of the model and for plotting.

plot_loadings(dataset, PC=1, color_range=0.1, width=6, height=6)[source]¶

Plots loadings of a principal component with the original retention and drift time coordinates.

Parameters:

dataset (ims.Dataset) – The dataset is needed for the retention and drift time coordinates.
PC (int, optional) – Principal component, by default 1.
color_range (float, optional) – The color scale ranges from -color_range to +color_range, centered at 0, by default 0.1.
width (int or float, optional) – Plot width in inches, by default 6.
height (int or float, optional) – Plot height in inches, by default 6.

Return type:

matplotlib.pyplot.axes

predict(X_test, decision_rule='both')[source]¶

Applies fitted SIMCA model to test data and makes a prediction about the target class membership.

Parameters:

X_test (numpy.ndarray of shape (n_samples, n_features)) – Test data as feature matrix.
decision_rule (str, optional) – Prediction based on either ‘Q’, ‘Tsq’ or ‘both’, by default “both”.

Returns:

Boolean result array.

Return type:

numpy.ndarray of shape (n_samples,)

Raises:

ValueError – If invalid value is given for decision_rule argument.
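The three decision rules can be pictured as threshold checks against the confidence limits. The sketch below is a hypothetical stand-alone function, not the library's code. It assumes a sample is accepted when its statistic does not exceed the limit, and that ‘both’ requires the Q check and the T square check to pass together.

```python
import numpy as np


def apply_decision_rule(Q, Tsq, Q_conf, Tsq_conf, decision_rule="both"):
    """Return a boolean array: True where a sample is accepted (assumed rule)."""
    q_ok = np.asarray(Q) <= Q_conf
    tsq_ok = np.asarray(Tsq) <= Tsq_conf
    if decision_rule == "Q":
        return q_ok
    if decision_rule == "Tsq":
        return tsq_ok
    if decision_rule == "both":
        return q_ok & tsq_ok
    raise ValueError("decision_rule must be 'Q', 'Tsq' or 'both'")


# Made-up statistics for three test samples and made-up limits.
Q_demo = [0.5, 2.0, 0.1]
Tsq_demo = [1.0, 1.0, 9.0]
for rule in ("Q", "Tsq", "both"):
    print(rule, apply_decision_rule(Q_demo, Tsq_demo, 1.0, 5.0, rule))
```

With these made-up numbers, the second sample fails only the Q check and the third fails only the T square check, so ‘both’ accepts just the first sample.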

scores_plot(y_train, y_test, x_comp=1, y_comp=2)[source]¶

Visualizes scores of fitted SIMCA model as Scores Plot.

Parameters:

y_train (numpy.ndarray of shape (n_samples,)) – True class labels for training data.
y_test (numpy.ndarray of shape (n_samples,)) – True class labels for test data.
x_comp (int, optional) – Component on the x axis, by default 1.
y_comp (int, optional) – Component on the y axis, by default 2.

Return type:: matplotlib.pyplot.axes