HCA

class ims.hca.HCA(dataset=None, affinity='euclidean', linkage='ward')[source]

Bases: object

Hierarchical cluster analysis with scikit-learn AgglomerativeClustering and scipy dendrogram.

Parameters
  • dataset (ims.Dataset, optional) – Dataset with GC-IMS data. Its sample and label names are used to label the dendrogram. If not set, leaf indices are used as labels instead, by default None.

  • affinity (str, optional) – Metric used to compute the linkage. Can be “euclidean”, “l1”, “l2”, “manhattan” or “cosine”. If linkage is set to “ward”, only “euclidean” is accepted. By default “euclidean”.

  • linkage (str, optional) – Linkage criterion which determines which distance to use. “ward”, “complete”, “average” or “single” are accepted, by default “ward”.
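
The constraint between affinity and linkage described above can be sketched as a small check. The helper name check_hca_options and the two constant sets are hypothetical and not part of ims; the sketch only restates the accepted values listed in the parameter descriptions.

```python
# Hypothetical helper, NOT part of ims: it only mirrors the documented rules.
ALLOWED_AFFINITIES = {"euclidean", "l1", "l2", "manhattan", "cosine"}
ALLOWED_LINKAGES = {"ward", "complete", "average", "single"}


def check_hca_options(affinity="euclidean", linkage="ward"):
    """Return (affinity, linkage) if the combination is documented as valid."""
    if affinity not in ALLOWED_AFFINITIES:
        raise ValueError(f"unsupported affinity: {affinity!r}")
    if linkage not in ALLOWED_LINKAGES:
        raise ValueError(f"unsupported linkage: {linkage!r}")
    # Ward linkage is documented as accepting only the euclidean metric.
    if linkage == "ward" and affinity != "euclidean":
        raise ValueError("linkage='ward' requires affinity='euclidean'")
    return affinity, linkage


# A non-euclidean metric is fine with, for example, average linkage.
options = check_hca_options(affinity="cosine", linkage="average")
```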

clustering

Scikit-learn algorithm used for the clustering. See the original documentation for details about attributes.

Type

sklearn.cluster.AgglomerativeClustering

linkage_matrix

Clustering results encoded as linkage matrix.

Type

numpy.ndarray
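
A linkage matrix in the scipy convention has one row per merge with four columns: the two merged cluster indices, the merge distance, and the number of samples in the new cluster. The following is a minimal sketch of how such a matrix can be derived from a fitted AgglomerativeClustering model, following the pattern used in the scikit-learn examples. It is an assumption that ims builds its linkage_matrix in a similar way; the function name and the random data are illustrative only.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering


def linkage_matrix_from_model(model):
    """Build a scipy-style linkage matrix from a fitted model (illustrative)."""
    n_samples = len(model.labels_)
    counts = np.zeros(model.children_.shape[0])
    for i, merge in enumerate(model.children_):
        count = 0
        for child in merge:
            if child < n_samples:
                count += 1  # child is an original sample (a leaf)
            else:
                count += counts[child - n_samples]  # child is an earlier merge
        counts[i] = count
    # Columns: child 1, child 2, merge distance, samples in the new cluster.
    return np.column_stack([model.children_, model.distances_, counts]).astype(float)


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))  # 6 samples, 3 features (made-up data)

# distance_threshold=0 with n_clusters=None computes the full tree,
# which also makes the merge distances available as model.distances_.
model = AgglomerativeClustering(linkage="ward", distance_threshold=0, n_clusters=None)
model.fit(X)

Z = linkage_matrix_from_model(model)
```

The last row of Z describes the final merge, so its fourth column equals the total number of samples.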

R

scipy dendrogram output as dictionary.

Type

dict
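
To show what a scipy dendrogram dictionary contains, the sketch below calls scipy.cluster.hierarchy.dendrogram directly with no_plot=True, so no figure is drawn. The data and label names are made up, and the linkage matrix is computed with scipy here rather than taken from an HCA instance.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))  # 5 samples, 3 features (made-up data)
labels = ["a", "b", "c", "d", "e"]  # illustrative sample names

Z = linkage(X, method="ward", metric="euclidean")

# no_plot=True returns the dictionary without drawing anything.
R = dendrogram(Z, labels=labels, no_plot=True)

# "leaves" holds the sample indices in plotting order,
# "ivl" holds the corresponding labels in the same order,
# "icoord" and "dcoord" hold the coordinates of the tree links.
leaf_order = R["leaves"]
leaf_labels = R["ivl"]
```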

Example

>>> import ims
>>> ds = ims.Dataset.read_mea("IMS_data")
>>> X, _ = ds.get_xy()
>>> hca = ims.HCA(ds, linkage="ward", affinity="euclidean")
>>> hca.fit(X)
>>> hca.plot_dendrogram()

fit(X)[source]

Fit the model from features.

Parameters

X (array-like of shape (n_samples, n_features)) – Training features to cluster.

plot_dendrogram(width=6, height=6, orientation='right', **kwargs)[source]

Plots clustering results as dendrogram.

Parameters
  • width (int, optional) – Width of the figure in inches, by default 6

  • height (int, optional) – Height of the figure in inches, by default 6

  • orientation (str, optional) – Root position of the clustering tree, by default “right”

  • **kwargs – See scipy.cluster.hierarchy.dendrogram documentation for information about valid keyword arguments.

Return type

matplotlib.pyplot.axes
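
As a rough illustration of what the width, height, orientation and keyword arguments control, the sketch below draws a dendrogram with scipy and matplotlib directly. It is not the ims implementation: the data and labels are made up, and color_threshold is used only as an example of a keyword accepted by scipy.cluster.hierarchy.dendrogram.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import matplotlib.pyplot as plt
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))  # 6 samples, 4 features (made-up data)
labels = [f"sample_{i}" for i in range(6)]  # illustrative names

Z = linkage(X, method="ward")

# width and height map to the figure size in inches.
fig, ax = plt.subplots(figsize=(6, 6))

# orientation="right" places the root of the tree on the right-hand side;
# color_threshold=0 is an example of an extra scipy keyword argument.
R = dendrogram(Z, orientation="right", labels=labels, ax=ax, color_threshold=0)
```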