tsdart package

tsdart.dataprocessing module

class tsdart.dataprocessing.Preprocessing(dtype=<class 'numpy.float32'>)

Bases: object

Preprocess the original trajectories to create datasets for training.

Parameters

dtype : dtype, default = np.float32

create_dataset(data, lag_time)

Create the dataset as the input to TS-DART.

Parameters

data : list or ndarray: The original trajectories.
lag_time : int: The lag time used to create the dataset consisting of time-instant and time-lagged data.

Returns

dataset : list: List of tuples. The length of the list is the number of data points. Each tuple has two elements: the instantaneous data frame and the corresponding time-lagged data frame.
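The pairing described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: it assumes each trajectory is a sequence of frames and that frame t is paired with frame t + lag_time inside the same trajectory. The helper name `make_lagged_pairs` is hypothetical and is not part of tsdart.

```python
def make_lagged_pairs(trajectories, lag_time):
    """Pair each frame with the frame lag_time steps later (hypothetical helper)."""
    if lag_time < 1:
        raise ValueError("lag_time must be a positive integer")
    pairs = []
    for traj in trajectories:
        # Pairs never cross trajectory boundaries; a trajectory with
        # lag_time frames or fewer contributes nothing.
        for t in range(len(traj) - lag_time):
            pairs.append((traj[t], traj[t + lag_time]))
    return pairs


# Two toy one-dimensional trajectories of 5 and 3 frames.
trajs = [[0.0, 0.1, 0.2, 0.3, 0.4], [1.0, 1.1, 1.2]]
dataset = make_lagged_pairs(trajs, lag_time=2)
print(len(dataset))  # 3 pairs from the first trajectory, 1 from the second
print(dataset[0])    # (instantaneous frame, time-lagged frame)
```

Under this assumption a trajectory of n frames yields n - lag_time pairs, so larger lag times shrink the dataset.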

transform2pw(data)

Transform xyz coordinates data to pairwise distances data.

Parameters

data : list or ndarray: xyz coordinates data; each trajectory has shape [num_frames, num_atoms, 3].

Returns

pw_data : list or ndarray: Pairwise distances data.
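As an illustration of this transformation, the sketch below computes one Euclidean distance per unique atom pair (i < j) for every frame. The pair ordering and the output shape are assumptions, since the description above does not state them, and the helper name `xyz_to_pairwise` is hypothetical.

```python
import math
from itertools import combinations


def xyz_to_pairwise(traj):
    """Map one trajectory of shape [num_frames, num_atoms, 3] to pairwise distances.

    Hypothetical helper: assumes one distance per unique atom pair (i < j).
    """
    return [
        [math.dist(a, b) for a, b in combinations(frame, 2)]
        for frame in traj
    ]


# One frame with three atoms forming a 3-4-5 right triangle in the xy-plane.
traj = [[[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 4.0, 0.0]]]
pw = xyz_to_pairwise(traj)
print(pw)  # one frame, three distances in pair order (0,1), (0,2), (1,2)
```

With this convention each frame of num_atoms atoms becomes num_atoms * (num_atoms - 1) / 2 features, which are invariant to rigid translation and rotation of the molecule.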

tsdart.loss module

class tsdart.loss.DisLoss(feat_dim, n_states, device, proto_update_factor=0.5, scaling_temperature=0.1)

Bases: Module

Compute dispersion loss.

Parameters

feat_dim : int: The dimension of the Euclidean space where the latent hypersphere is embedded. The dimension of the latent hypersphere is (feat_dim-1).
n_states : int: Number of metastable states to be specified.
device : torch.device: The device on which the torch modules are executed.
proto_update_factor : float, default = 0.5: The state center update factor.
scaling_temperature : float, default = 0.1: The scaling hyperparameter to compute dispersion loss.

clear(): Clear the list of recorded dispersion losses.

forward(features, labels)

Compute dispersion loss at every call.

Parameters

features : torch.Tensor: Hyperspherical embeddings of a batch of data.
labels : torch.Tensor: Metastable states of a batch of data.

Returns

loss : torch.Tensor: Dispersion loss
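The formula is not given here. The sketch below shows one common form of a dispersion loss from the prototype-based hyperspherical embedding literature, and it is an assumption that tsdart follows it: each state center is updated as a normalized moving average of the embeddings assigned to it, and the loss is the mean over states of the log of the average exp(similarity / temperature) to all other state centers. All function names are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def update_prototype(prototype, embedding, update_factor=0.5):
    """Assumed moving-average update of one state center, then renormalization."""
    mixed = [update_factor * p + (1.0 - update_factor) * z
             for p, z in zip(prototype, embedding)]
    return normalize(mixed)


def dispersion_loss(prototypes, temperature=0.1):
    """Assumed form: mean over states of the log of the average
    exp(similarity / temperature) to all other state centers."""
    n = len(prototypes)
    total = 0.0
    for i, p_i in enumerate(prototypes):
        sims = [sum(a * b for a, b in zip(p_i, p_j)) / temperature
                for j, p_j in enumerate(prototypes) if j != i]
        total += math.log(sum(math.exp(s) for s in sims) / (n - 1))
    return total / n


# Four state centers on the unit circle (feat_dim = 2, n_states = 4).
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
clumped = [normalize([1.0, 0.1 * k]) for k in range(4)]
print(dispersion_loss(spread) < dispersion_loss(clumped))  # separated centers score lower
```

In this form, minimizing the loss pushes state centers apart on the hypersphere, and a smaller temperature weights the closest pairs of centers more heavily.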

output_mean_score()

Output the average of recorded dispersion losses within the list.

Returns

mean_score : torch.Tensor: The averaged dispersion loss

save(): Save the dispersion loss to the list.

class tsdart.loss.Prototypes(n_states, device, scaling_temperature=0.1)

Bases: Module

Compute the prototypes (state center vectors). Used for evaluating validation data.

Parameters

n_states : int: Number of metastable states to be specified.
device : torch.device: The device on which the torch modules are executed.
scaling_temperature : float, default = 0.1: The scaling hyperparameter to compute dispersion loss.

clear(): Clear the lists of recorded state centers and dispersion losses.

forward(features, labels)

Compute dispersion loss and state center vectors at every call.

Parameters

features : torch.Tensor: Hyperspherical embeddings of a batch of data.
labels : torch.Tensor: Metastable states of a batch of data.

Returns

prototypes : torch.Tensor: State center vectors of shape [n_states, feat_dim].

output_mean_disloss()

Output the average of recorded dispersion losses within the score list.

Returns

mean_disloss : torch.Tensor: The averaged dispersion loss

output_mean_prototypes()

Output the average of recorded state centers within the list.

Returns

mean_prototypes : torch.Tensor: The averaged state center vectors

class tsdart.loss.VAMPLoss(epsilon=1e-06, mode='regularize', symmetrized=False)

Bases: Module

Compute VAMP2 loss.

Parameters

epsilon : float, default = 1e-6: The regularization/truncation parameter for eigenvalues.
mode : str, default = ‘regularize’: ‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.
symmetrized : boolean, default = False: Whether to symmetrize time-correlation matrices or not.

clear(): Clear the list.

forward(data)

Compute VAMP2 loss at every call.

Parameters

data : tuple: Softmax probabilities of a batch of transition pairs.

Returns

loss : torch.Tensor: VAMP-2 loss
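For orientation, a VAMP-2 score is commonly defined as the squared Frobenius norm of C00^{-1/2} C01 C11^{-1/2}, where C00, C01 and C11 are the covariance matrices of the instantaneous and time-lagged outputs. The NumPy sketch below assumes that definition with mean removal, 'regularize'-style handling of small eigenvalues, and a loss equal to the negative score. Any constant offset the library may add is omitted, and the function names are hypothetical.

```python
import numpy as np


def inv_sqrt(matrix, epsilon=1e-6):
    """Inverse square root of a symmetric PSD matrix, regularized by epsilon."""
    eigval, eigvec = np.linalg.eigh(matrix)
    eigval = np.abs(eigval) + epsilon
    return eigvec @ np.diag(eigval ** -0.5) @ eigvec.T


def vamp2_score(chi_t, chi_tau, epsilon=1e-6):
    """Squared Frobenius norm of C00^{-1/2} C01 C11^{-1/2} on mean-removed data."""
    x = chi_t - chi_t.mean(axis=0)
    y = chi_tau - chi_tau.mean(axis=0)
    n = x.shape[0]
    c00, c01, c11 = x.T @ x / (n - 1), x.T @ y / (n - 1), y.T @ y / (n - 1)
    k = inv_sqrt(c00, epsilon) @ c01 @ inv_sqrt(c11, epsilon)
    return float(np.sum(k ** 2))


rng = np.random.default_rng(0)
logits = rng.normal(size=(500, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax rows

score_same = vamp2_score(probs, probs)                       # lagged copy is fully predictable
score_shuffled = vamp2_score(probs, rng.permutation(probs))  # pairing destroyed
loss = -score_same  # a loss to minimize is the negative score
print(score_same > score_shuffled)
```

Because softmax rows sum to one, the mean-removed covariance matrices are rank deficient, which is why epsilon is needed before inverting them.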

output_mean_score()

Output the average of recorded VAMP2 scores within the list.

Returns

mean_score : torch.Tensor: The averaged VAMP-2 score

save(): Save the VAMP2 score to the list.

tsdart.model module

class tsdart.model.TSDART(lobe, optimizer='Adam', device=None, learning_rate=0.001, epsilon=1e-06, mode='regularize', symmetrized=False, dtype=<class 'numpy.float32'>, feat_dim=2, n_states=4, proto_update_factor=0.5, scaling_temperature=0.1, beta=0.01, save_model_interval=None, pretrain=0, print=False)

Bases: object

The method used to train TS-DART.

Parameters

lobe : torch.nn.Module: TS-DART lobe.
optimizer : str, default = ‘Adam’: The type of optimizer used for training.
device : torch.device, default = None: The device on which the torch modules are executed.
learning_rate : float, default = 1e-3: The learning rate of the optimizer.
epsilon : float, default = 1e-6: The strength of the regularization/truncation under which matrices are inverted.
mode : str, default = ‘regularize’: ‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.
symmetrized : boolean, default = False: Whether to symmetrize time-correlation matrices or not.
dtype : dtype, default = np.float32: The data type of the input data and the parameters of the model.
feat_dim : int, default = 2: The dimension of the Euclidean space where the latent hypersphere is embedded. The dimension of the latent hypersphere is (feat_dim-1).
n_states : int, default = 4: Number of metastable states to be specified.
proto_update_factor : float, default = 0.5: The state center update factor.
scaling_temperature : float, default = 0.1: The scaling hyperparameter to compute dispersion loss.
beta : float, default = 0.01: The weight of dispersion loss.
save_model_interval : int, default = None: Save the model every ‘save_model_interval’ epochs.
pretrain : int, default = 0: The number of pretraining epochs that use the pure VAMP2 loss.
print : boolean, default = False: Whether to print the validation loss every epoch during training.

fetch_model(): Yields the current model.

fit(train_loader, n_epochs=1, validation_loader=None, progress=<class 'tqdm.std.tqdm'>)

Performs fit on data.

Parameters

train_loader : torch.utils.data.DataLoader: Yield a tuple of batches representing instantaneous and time-lagged samples for training.
n_epochs : int, default=1: The number of epochs to use for training. Note that n_epochs should be larger than pretrain.
validation_loader : torch.utils.data.DataLoader, optional, default=None: Yield a tuple of batches representing instantaneous and time-lagged samples for validation.

progress : context manager, default=tqdm

Returns

self : TSDART

partial_fit(data)

Performs partial fit on one batch of data.

Parameters

data : tuple: The data containing a batch of time-instantaneous data and a batch of time-lagged data.

Returns

self : TSDART

property training_dis

property training_vamp

validate(val_data)

Evaluate the current model on validation data.

Parameters

val_data : tuple: The validation data containing a batch of time-instantaneous data and a batch of time-lagged data.

property validation_dis

property validation_prototypes

property validation_vamp

class tsdart.model.TSDARTEstimator(tsdart_model: TSDARTModel)

Bases: object

The TS-DART estimator to generate the state center vectors and OOD scores of the original trajectories.

Parameters

tsdart_model : TSDARTModel: The trained TS-DART model.

fit(data)

Fit the TS-DART model with original trajectories to compute OOD scores and state center vectors.

Parameters

data : list or ndarray: The original trajectories.

Returns

self : TSDARTEstimator

property ood_scores

property state_centers

class tsdart.model.TSDARTLayer(layer_sizes: list, n_states: int, scale=1)

Bases: Module

Create TS-DART lobe.

Parameters

layer_sizes : list: The size of each layer of the encoder. The last component should represent the dimension of the Euclidean space where the latent hypersphere is embedded.
n_states : int: Number of metastable states to be specified.
scale : int, default = 1: The radius of the hypersphere.

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class tsdart.model.TSDARTModel(lobe, device=None, dtype=<class 'numpy.float32'>)

Bases: object

The TS-DART model from TS-DART.

Parameters

lobe : torch.nn.Module: TS-DART lobe.
device : torch.device, default = None: The device on which the torch modules are executed.
dtype : dtype, default = np.float32: The data type of the input data and the parameters of the model.

property lobe

transform(data, return_type='probs')

Transform the original trajectories to different outputs after training.

Parameters

data : list or ndarray: The original trajectories.
return_type : string, default = ‘probs’: The output to return. ‘probs’: the softmax probabilities used to assign each conformation to a metastable state. ‘states’: the metastable state assignment of each conformation. ‘hypersphere_embs’: the hyperspherical embedding of each conformation.

Returns

The transformed trajectories.
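The three return types are related. The plain-Python sketch below assumes that a state assignment is the index of the largest softmax probability and that a hyperspherical embedding is a vector rescaled to the hypersphere radius (the scale parameter of TSDARTLayer). Both helpers are hypothetical illustrations, not tsdart functions.

```python
import math


def to_states(probs):
    """Assumed relation: each conformation goes to its most probable state."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


def project_to_hypersphere(vector, scale=1.0):
    """Assumed relation: rescale a latent vector to Euclidean norm `scale`."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [scale * x / norm for x in vector]


# Softmax probabilities of three conformations over three metastable states.
probs = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.25, 0.5, 0.25]]
states = to_states(probs)
emb = project_to_hypersphere([3.0, 4.0], scale=1.0)
print(states)  # one state index per conformation
print(emb)     # a point on the unit circle (feat_dim = 2)
```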

tsdart.utils module

tsdart.utils.calculate_inverse(matrix, epsilon=1e-06, return_sqrt=False, mode='regularize')

This method computes the inverse, or the square root of the inverse, of a matrix. It is used to estimate the Koopman matrix.

Parameters

matrix : torch.Tensor: The matrix to be inverted.
epsilon : float, default = 1e-6: The regularization/truncation parameter for eigenvalues.
return_sqrt : boolean, optional, default = False: If True, the square root of the inverse matrix is returned instead.
mode : str, default = ‘regularize’: ‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.

Returns

inverse : torch.Tensor: Inverse of the matrix.
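The two modes can be illustrated with an eigendecomposition-based inverse. This is a NumPy sketch under the assumption that the input is symmetric positive semi-definite; it is not the library's torch implementation, and the name `symmetric_inverse` is hypothetical.

```python
import numpy as np


def symmetric_inverse(matrix, epsilon=1e-6, return_sqrt=False, mode="regularize"):
    """Hypothetical NumPy analogue for symmetric positive semi-definite input."""
    eigval, eigvec = np.linalg.eigh(matrix)
    if mode == "regularize":
        # Shift every eigenvalue away from zero.
        eigval = np.abs(eigval) + epsilon
    elif mode == "trunc":
        # Drop directions whose eigenvalue does not exceed epsilon.
        keep = eigval > epsilon
        eigval, eigvec = eigval[keep], eigvec[:, keep]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    power = -0.5 if return_sqrt else -1.0
    return eigvec @ np.diag(eigval ** power) @ eigvec.T


# Rank-deficient 2x2 matrix: eigenvalues are 2 and 0.
m = np.array([[1.0, 1.0], [1.0, 1.0]])
pinv = symmetric_inverse(m, mode="trunc")  # pseudo-inverse on the kept subspace
half = symmetric_inverse(m, return_sqrt=True, mode="trunc")
print(np.allclose(pinv, np.linalg.pinv(m)))
print(np.allclose(half @ half, pinv))
```

In this sketch 'trunc' yields a pseudo-inverse restricted to the well-conditioned subspace, while 'regularize' keeps every direction and bounds the largest entry of the inverse spectrum by 1/epsilon.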

tsdart.utils.compute_covariance_matrix(x: Tensor, y: Tensor, remove_mean=True)

This method can be applied to compute the covariance matrix from two batches of data.

Parameters

x : torch.Tensor: The first batch of data of shape [batch_size, num_basis].
y : torch.Tensor: The second batch of data of shape [batch_size, num_basis].
remove_mean : boolean, optional, default = True: Whether to remove mean of the data.

Returns

(cov_00, cov_01, cov_11) : Tuple[torch.Tensor, torch.Tensor, torch.Tensor]: Instantaneous covariance matrix of x, time-lagged covariance matrix of x and y, and instantaneous covariance matrix of y.
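A NumPy sketch of the three matrices follows. It assumes normalization by batch_size - 1, which the description above does not state, and the helper name `covariances` is hypothetical.

```python
import numpy as np


def covariances(x, y, remove_mean=True):
    """Hypothetical NumPy analogue; assumes normalization by batch_size - 1."""
    if remove_mean:
        x = x - x.mean(axis=0, keepdims=True)
        y = y - y.mean(axis=0, keepdims=True)
    scale = 1.0 / (x.shape[0] - 1)
    cov_00 = scale * x.T @ x  # instantaneous covariance of x
    cov_01 = scale * x.T @ y  # time-lagged covariance of x and y
    cov_11 = scale * y.T @ y  # instantaneous covariance of y
    return cov_00, cov_01, cov_11


rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))  # instantaneous batch, shape [batch_size, num_basis]
y = rng.normal(size=(100, 3))  # time-lagged batch, same shape
cov_00, cov_01, cov_11 = covariances(x, y)
print(cov_00.shape, cov_01.shape, cov_11.shape)
```

Each output has shape [num_basis, num_basis]; cov_00 and cov_11 are symmetric, while cov_01 generally is not.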

tsdart.utils.eig_decomposition(matrix, epsilon=1e-06, mode='regularize')

This method performs the eigendecomposition of a rank-deficient Hermitian matrix. It is used to estimate the Koopman matrix.

Parameters

matrix : torch.Tensor: The Hermitian matrix: specifically, the covariance matrix.
epsilon : float, default = 1e-6: The regularization/truncation parameter for eigenvalues.
mode : str, default = ‘regularize’: ‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.

Returns

(eigval, eigvec) : Tuple[torch.Tensor, torch.Tensor]: Eigenvalues and eigenvectors.

tsdart.utils.estimate_koopman_matrix(data: Tensor, data_lagged: Tensor, epsilon=1e-06, mode='regularize', symmetrized=False)

This method computes the Koopman matrix from time-instant and time-lagged data.

Parameters

data : torch.Tensor: The time-instant data of shape [batch_size, num_basis].
data_lagged : torch.Tensor: The time-lagged data of shape [batch_size, num_basis].
epsilon : float, default = 1e-6: The regularization/truncation parameter for eigenvalues.
mode : str, default = ‘regularize’: ‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.
symmetrized : boolean, default = False: Whether to symmetrize time-correlation matrices or not.

Returns

koopman_matrix : torch.Tensor: The Koopman matrix of shape [num_basis, num_basis].

tsdart.utils.map_data(data, device=None, dtype=<class 'numpy.float32'>)

This function is used to yield the torch.Tensor type data from multiple trajectories.

Parameters

data : list or tuple or ndarray: The trajectories of data.
device : torch.device, default = None: The device on which the torch modules are executed.
dtype : dtype, default = np.float32: The data type of the input data and the parameters of the model.

Yields

x : torch.Tensor: The mapped data.

tsdart.utils.set_random_seed(seed): Set a random seed.