diffusion_map

Routines and class definitions for the diffusion maps algorithm.

class pydiffmap.diffusion_map.DiffusionMap(alpha=0.5, k=64, kernel_type='gaussian', epsilon='bgh', n_evecs=1, neighbor_params=None, metric='euclidean', metric_params=None)[source]

Diffusion Map object to be used in data analysis for fun and profit.

Parameters:
  • alpha (scalar, optional) – Exponent to be used for the left normalization in constructing the diffusion map.
  • k (int, optional) – Number of nearest neighbors over which to construct the kernel.
  • kernel_type (string, optional) – Type of kernel to construct. Currently the only option is ‘gaussian’, but more will be implemented.
  • epsilon (string or scalar, optional) – Method for choosing the epsilon. Currently, the only options are to provide a scalar (epsilon is set to the provided scalar) or ‘bgh’ (Berry, Giannakis and Harlim).
  • n_evecs (int, optional) – Number of diffusion map eigenvectors to return.
  • neighbor_params (dict or None, optional) – Optional parameters for the nearest neighbor search. See scikit-learn NearestNeighbors class for details.
  • metric (string or callable, optional) – Metric for distances in the kernel. Default is ‘euclidean’. If a callable is given, it should take two arrays as input and return one value indicating the distance between them.
  • metric_params (dict or None, optional) – Optional parameters required for the metric given.
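To make the roles of alpha, epsilon and n_evecs concrete, here is a minimal NumPy sketch of a textbook diffusion map. It is not the pydiffmap implementation: it builds a dense kernel instead of a k-nearest-neighbor one, the function name diffusion_map_sketch is hypothetical, and the kernel scaling exp(-d^2 / (4 * epsilon)) is an assumption.

```python
import numpy as np


def diffusion_map_sketch(X, epsilon=0.1, alpha=0.5, n_evecs=1):
    """Illustrative dense diffusion map (hypothetical helper, not library code)."""
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances between all points.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Gaussian kernel; the factor 4 in the bandwidth is an assumed convention.
    K = np.exp(-sq_dists / (4.0 * epsilon))
    # Normalization by the kernel density estimate q raised to the power alpha.
    q = K.sum(axis=1)
    K_alpha = K / np.outer(q ** alpha, q ** alpha)
    # Row sums d define the Markov matrix P = D^{-1} K_alpha.
    d = K_alpha.sum(axis=1)
    # Work with the symmetric conjugate S = D^{-1/2} K_alpha D^{-1/2},
    # which has the same eigenvalues as P.
    S = K_alpha / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Map eigenvectors of S back to right eigenvectors of P.
    psi = evecs / np.sqrt(d)[:, None]
    # Skip the trivial pair (eigenvalue 1, constant eigenvector).
    return evals[1:n_evecs + 1], psi[:, 1:n_evecs + 1]


# Usage: 30 random points in the unit square, two diffusion coordinates.
rng = np.random.default_rng(0)
X_demo = rng.random((30, 2))
evals_demo, psi_demo = diffusion_map_sketch(X_demo, epsilon=0.1, alpha=0.5, n_evecs=2)
```

In this sketch, alpha = 0 leaves the kernel unnormalized before the row normalization, while alpha = 1 divides out the full sampling density estimate q.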

Examples

>>> from pydiffmap.diffusion_map import DiffusionMap
# Set up the neighbor_params dict with as many jobs as CPU cores and a kd_tree neighbor search.
>>> neighbor_params = {'n_jobs': -1, 'algorithm': 'kd_tree'}
# Initialize a diffusion map object that computes the top two eigenvectors, with epsilon set to 0.1
# and alpha set to 1.0.
>>> mydmap = DiffusionMap(n_evecs=2, epsilon=0.1, alpha=1.0, neighbor_params=neighbor_params)

fit(X, weights=None)[source]

Fits the data.

Parameters:
  • X (array-like, shape (n_query, n_features)) – Data upon which to construct the diffusion map.
  • weights (array-like, optional, shape (n_query)) – Values of a weight function for the data. This effectively adds a drift term equivalent to the gradient of the log of the weighting function to the final operator.
Returns:

self (the object itself)

fit_transform(X, weights=None)[source]

Fits the data and returns diffusion coordinates. Equivalent to calling dmap.fit(X).transform(X).

Parameters:
  • X (array-like, shape (n_query, n_features)) – Data upon which to construct the diffusion map.
  • weights (array-like, optional, shape (n_query)) – Values of a weight function for the data. This effectively adds a drift term equivalent to the gradient of the log of the weighting function to the final operator.
Returns:

phi (numpy array, shape (n_query, n_eigenvectors)) – Diffusion coordinates of the given points.

transform(Y)[source]

Performs Nystroem out-of-sample extension to calculate the values of the diffusion coordinates at each given point.

Parameters: Y (array-like, shape (n_query, n_features)) – Data for which to perform the out-of-sample extension.
Returns: phi (numpy array, shape (n_query, n_eigenvectors)) – Diffusion coordinates of the given points.
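
The idea behind the out-of-sample extension can be sketched in NumPy as follows. This is the textbook Nystroem formula under stated assumptions, not the library's code: the helper names (gaussian_kernel, fit_sketch, nystroem_sketch) are hypothetical, the kernel is dense, and the bandwidth convention exp(-d^2 / (4 * epsilon)) is assumed. Each new point is assigned the kernel-weighted average of the training eigenvector values, divided by the corresponding eigenvalue.

```python
import numpy as np


def gaussian_kernel(A, B, epsilon):
    """Dense Gaussian kernel between the rows of A and the rows of B."""
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (4.0 * epsilon))  # factor 4 is an assumed convention


def fit_sketch(X, epsilon, alpha, n_evecs):
    """Return the density q, eigenvalues and eigenvectors on the training data."""
    K = gaussian_kernel(X, X, epsilon)
    q = K.sum(axis=1)
    K_alpha = K / np.outer(q ** alpha, q ** alpha)
    d = K_alpha.sum(axis=1)
    # Symmetric conjugate of the Markov matrix P = D^{-1} K_alpha.
    S = K_alpha / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)
    # Indices of the largest eigenvalues, skipping the trivial eigenvalue 1.
    idx = np.argsort(evals)[::-1][1:n_evecs + 1]
    psi = evecs[:, idx] / np.sqrt(d)[:, None]
    return q, evals[idx], psi


def nystroem_sketch(Y, X, q, evals, psi, epsilon, alpha):
    """Extend the eigenvectors psi from training points X to new points Y."""
    # Kernel between new and training points, with the training-side
    # alpha normalization; the Y-side factor cancels in the row normalization.
    K_yx = gaussian_kernel(Y, X, epsilon) / (q ** alpha)[None, :]
    P_yx = K_yx / K_yx.sum(axis=1, keepdims=True)
    # psi(y) = (1 / lambda) * sum_j P(y, x_j) * psi(x_j)
    return (P_yx @ psi) / evals[None, :]


# Usage: fit on 40 training points, then extend to 5 new points.
rng = np.random.default_rng(0)
X_train = rng.random((40, 2))
Y_new = rng.random((5, 2))
q, evals, psi = fit_sketch(X_train, epsilon=0.05, alpha=0.5, n_evecs=2)
phi_new = nystroem_sketch(Y_new, X_train, q, evals, psi, epsilon=0.05, alpha=0.5)
```

A useful sanity check for this sketch is that extending the training points themselves reproduces psi up to floating-point error, because psi satisfies the eigenvector equation of the Markov matrix.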