tsdart package

tsdart.dataprocessing module

class tsdart.dataprocessing.Preprocessing(dtype=<class 'numpy.float32'>)

Bases: object

Preprocess the original trajectories to create datasets for training.

Parameters

dtype : dtype, default = np.float32

create_dataset(data, lag_time)

Create the dataset as the input to TS-DART.

Parameters

data : list or ndarray

The original trajectories.

lag_time : int

The lag time used to create the dataset of time-instant and time-lagged data.

Returns

dataset : list

List of tuples, one per sample, so the length of the list is the number of samples. Each tuple has two elements: the instantaneous data frame and the corresponding time-lagged data frame.
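
The pairing of each frame with its time-lagged partner can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name make_lagged_pairs is hypothetical, and pairing frame t with frame t + lag_time only within a single trajectory is an assumption.

```python
def make_lagged_pairs(trajectories, lag_time):
    """Pair each frame with the frame lag_time steps later (hypothetical helper)."""
    pairs = []
    for traj in trajectories:
        # Assumed behaviour: pairs never span two different trajectories.
        for t in range(len(traj) - lag_time):
            pairs.append((traj[t], traj[t + lag_time]))
    return pairs


# Two toy trajectories whose "frames" are plain numbers.
toy = [[0, 1, 2, 3], [10, 11, 12]]
pairs = make_lagged_pairs(toy, lag_time=2)
print(pairs)  # [(0, 2), (1, 3), (10, 12)]
```

Under this assumption, a trajectory shorter than or equal to lag_time contributes no pairs.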

transform2pw(data)

Transform xyz coordinate data to pairwise distance data.

Parameters

data : list or ndarray

xyz coordinate data; each trajectory has shape [num_frames, num_atoms, 3].

Returns

pw_data : list or ndarray

Pairwise distance data.
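
A minimal NumPy sketch of such a transformation for a single trajectory is given below. The helper name pairwise_distances is hypothetical, and keeping only the unique atom pairs (i < j) in upper-triangular order is an assumption; the library may order or select pairs differently.

```python
import numpy as np


def pairwise_distances(traj):
    """Distances between all unique atom pairs (hypothetical helper).

    traj has shape [num_frames, num_atoms, 3]; the result has shape
    [num_frames, num_atoms * (num_atoms - 1) / 2].
    """
    traj = np.asarray(traj, dtype=np.float64)
    i, j = np.triu_indices(traj.shape[1], k=1)  # index pairs with i < j
    return np.linalg.norm(traj[:, i] - traj[:, j], axis=-1)


# One frame with three atoms.
frame = [[[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 1.0]]]
pw = pairwise_distances(frame)
print(pw.shape)  # (1, 3)
```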

tsdart.loss module

class tsdart.loss.DisLoss(feat_dim, n_states, device, proto_update_factor=0.5, scaling_temperature=0.1)

Bases: Module

Compute dispersion loss.

Parameters

feat_dim : int

The dimension of the Euclidean space in which the latent hypersphere is embedded. The latent hypersphere itself has dimension (feat_dim-1).

n_states : int

Number of metastable states to be specified.

device : torch.device

The device on which the torch modules are executed.

proto_update_factor : float, default = 0.5

The state center update factor.

scaling_temperature : float, default = 0.1

The scaling hyperparameter used to compute the dispersion loss.

clear()

Clear the list of recorded dispersion losses.

forward(features, labels)

Compute dispersion loss at every call.

Parameters

features : torch.Tensor

Hyperspherical embeddings of a batch of data.

labels : torch.Tensor

Metastable states of a batch of data.

Returns

loss : torch.Tensor

Dispersion loss.
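
For orientation, one common form of a dispersion loss on hyperspherical state centers is sketched below in NumPy. Whether DisLoss uses exactly this formula, and whether proto_update_factor acts as the moving-average weight shown here, are assumptions; the function names are hypothetical.

```python
import numpy as np


def update_prototypes(prototypes, features, labels, alpha=0.5):
    """Moving-average update of state centers, renormalized to unit length."""
    protos = prototypes.copy()
    for z, c in zip(features, labels):
        p = alpha * protos[c] + (1.0 - alpha) * z
        protos[c] = p / np.linalg.norm(p)
    return protos


def dispersion_loss(prototypes, temperature=0.1):
    """Mean over states of log-mean-exp similarity to all other state centers."""
    n = prototypes.shape[0]
    sim = prototypes @ prototypes.T / temperature
    off_diag = ~np.eye(n, dtype=bool)  # exclude each center's self-similarity
    per_state = np.log((np.exp(sim) * off_diag).sum(axis=1) / (n - 1))
    return per_state.mean()


# Four centers spread around the unit circle versus four identical centers.
spread = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
clustered = np.tile([1.0, 0.0], (4, 1))
loss_spread = dispersion_loss(spread)
loss_clustered = dispersion_loss(clustered)

# Pull the first center halfway towards a new embedding.
updated = update_prototypes(spread, np.array([[0.0, 1.0]]), [0], alpha=0.5)
```

In this form the loss is lower when the state centers are far apart on the hypersphere, so minimizing it pushes them away from each other.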

output_mean_score()

Output the average of recorded dispersion losses within the list.

Returns

mean_score : torch.Tensor

The averaged dispersion loss.

save()

Save the dispersion loss to the list.

class tsdart.loss.Prototypes(n_states, device, scaling_temperature=0.1)

Bases: Module

Compute the prototypes (state center vectors). Used for evaluating validation data.

Parameters

n_states : int

Number of metastable states to be specified.

device : torch.device

The device on which the torch modules are executed.

scaling_temperature : float, default = 0.1

The scaling hyperparameter used to compute the dispersion loss.

clear()

Clear the lists of recorded state centers and dispersion losses.

forward(features, labels)

Compute dispersion loss and state center vectors at every call.

Parameters

features : torch.Tensor

Hyperspherical embeddings of a batch of data.

labels : torch.Tensor

Metastable states of a batch of data.

Returns

prototypes : torch.Tensor

State center vectors of shape [n_states, feat_dim].

output_mean_disloss()

Output the average of recorded dispersion losses within the score list.

Returns

mean_dissloss : torch.Tensor

The averaged dispersion loss.

output_mean_prototypes()

Output the average of recorded state centers within the list.

Returns

mean_prototypes : torch.Tensor

The averaged state center vectors.

class tsdart.loss.VAMPLoss(epsilon=1e-06, mode='regularize', symmetrized=False)

Bases: Module

Compute VAMP2 loss.

Parameters

epsilon : float, default = 1e-6

The regularization/truncation parameter for eigenvalues.

mode : str, default = ‘regularize’

‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.

symmetrized : boolean, default = False

Whether to symmetrize time-correlation matrices or not.

clear()

Clear the list of recorded VAMP2 scores.

forward(data)

Compute VAMP2 loss at every call.

Parameters

data : tuple

Softmax probabilities of a batch of transition pairs.

Returns

loss : torch.Tensor

VAMP-2 loss.
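
As background, the VAMP-2 score is commonly computed as the squared Frobenius norm of the whitened time-lagged covariance matrix, and the loss is its negative. The NumPy sketch below follows that textbook definition; the helper names, the (n - 1) normalization, and the regularization shown are assumptions rather than a description of VAMPLoss itself, and some implementations add a constant for the removed mean.

```python
import numpy as np


def inv_sqrt(matrix, epsilon=1e-6):
    """Regularized inverse square root of a symmetric matrix."""
    eigval, eigvec = np.linalg.eigh(matrix)
    eigval = np.abs(eigval) + epsilon  # 'regularize'-style shift (assumed)
    return eigvec @ np.diag(eigval ** -0.5) @ eigvec.T


def vamp2_score(chi_t, chi_tau, epsilon=1e-6):
    """Squared Frobenius norm of C00^(-1/2) C01 C11^(-1/2)."""
    n = chi_t.shape[0]
    x = chi_t - chi_t.mean(axis=0)
    y = chi_tau - chi_tau.mean(axis=0)
    c00 = x.T @ x / (n - 1)
    c01 = x.T @ y / (n - 1)
    c11 = y.T @ y / (n - 1)
    koopman = inv_sqrt(c00, epsilon) @ c01 @ inv_sqrt(c11, epsilon)
    return np.sum(koopman ** 2)


def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
probs_t = softmax(rng.normal(size=(2000, 3)))
probs_other = softmax(rng.normal(size=(2000, 3)))

score_same = vamp2_score(probs_t, probs_t)       # perfectly correlated pairs
score_indep = vamp2_score(probs_t, probs_other)  # unrelated pairs
loss = -score_same                               # the quantity to minimize
```

Because softmax rows sum to one, the mean-free covariance matrices are rank deficient, which is why epsilon is needed before inverting them.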

output_mean_score()

Output the average of recorded VAMP2 scores within the list.

Returns

mean_score : torch.Tensor

The averaged VAMP-2 score.

save()

Save the VAMP2 score to the list.

tsdart.model module

class tsdart.model.TSDART(lobe, optimizer='Adam', device=None, learning_rate=0.001, epsilon=1e-06, mode='regularize', symmetrized=False, dtype=<class 'numpy.float32'>, feat_dim=2, n_states=4, proto_update_factor=0.5, scaling_temperature=0.1, beta=0.01, save_model_interval=None, pretrain=0, print=False)

Bases: object

The method used to train TS-DART.

Parameters

lobe : torch.nn.Module

TS-DART lobe.

optimizer : str, default = ‘Adam’

The type of optimizer used for training.

device : torch.device, default = None

The device on which the torch modules are executed.

learning_rate : float, default = 1e-3

The learning rate of the optimizer.

epsilon : float, default = 1e-6

The strength of the regularization/truncation under which matrices are inverted.

mode : str, default = ‘regularize’

‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.

symmetrized : boolean, default = False

Whether to symmetrize time-correlation matrices or not.

dtype : dtype, default = np.float32

The data type of the input data and the parameters of the model.

feat_dim : int, default = 2

The dimension of the Euclidean space in which the latent hypersphere is embedded. The latent hypersphere itself has dimension (feat_dim-1).

n_states : int, default = 4

Number of metastable states to be specified.

proto_update_factor : float, default = 0.5

The state center update factor.

scaling_temperature : float, default = 0.1

The scaling hyperparameter used to compute the dispersion loss.

beta : float, default = 0.01

The weight of the dispersion loss.

save_model_interval : int, default = None

Save the model every save_model_interval epochs.

pretrain : int, default = 0

The number of pretraining epochs that use the pure VAMP2 loss.

print : boolean, default = False

Whether to print the validation loss every epoch during training.

fetch_model()

Yields the current model.

fit(train_loader, n_epochs=1, validation_loader=None, progress=<class 'tqdm.std.tqdm'>)

Performs fit on data.

Parameters

train_loader : torch.utils.data.DataLoader

Yields tuples of batches representing instantaneous and time-lagged samples for training.

n_epochs : int, default=1

The number of epochs to use for training. Note that n_epochs should be larger than pretrain.

validation_loader : torch.utils.data.DataLoader, optional, default=None

Yields tuples of batches representing instantaneous and time-lagged samples for validation.

progress : context manager, default=tqdm

Returns

self : TSDART
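
The interplay of n_epochs, pretrain and beta can be pictured with the following plain-Python sketch. It assumes that the first pretrain epochs use the VAMP2 loss alone and that later epochs add the dispersion loss weighted by beta; the function total_loss is hypothetical and the actual combination inside fit may differ.

```python
def total_loss(epoch, vamp_loss, dis_loss, pretrain=0, beta=0.01):
    """Loss used in a given epoch under the assumed schedule (hypothetical)."""
    if epoch < pretrain:
        return vamp_loss                    # pretraining: VAMP2 loss only
    return vamp_loss + beta * dis_loss      # afterwards: add weighted dispersion loss


n_epochs, pretrain = 5, 2
uses_dispersion = [
    total_loss(epoch, vamp_loss=-1.5, dis_loss=2.0, pretrain=pretrain) != -1.5
    for epoch in range(n_epochs)
]
print(uses_dispersion)  # [False, False, True, True, True]
```

Under this reading, choosing n_epochs no larger than pretrain would mean the dispersion term never contributes to training.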

partial_fit(data)

Performs partial fit on one batch of data.

Parameters

data : tuple

The data containing a batch of time-instantaneous data and a batch of time-lagged data.

Returns

self : TSDART

property training_dis

property training_vamp

validate(val_data)

Evaluate the current model on validation data.

Parameters

val_data : tuple

The validation data containing a batch of time-instantaneous data and a batch of time-lagged data.

property validation_dis

property validation_prototypes

property validation_vamp

class tsdart.model.TSDARTEstimator(tsdart_model: TSDARTModel)

Bases: object

The TS-DART estimator that generates the state center vectors and OOD scores of the original trajectories.

Parameters

tsdart_model : TSDARTModel

The trained TS-DART model.

fit(data)

Fit the TS-DART model with original trajectories to compute OOD scores and state center vectors.

Parameters

data : list or ndarray

The original trajectories.

Returns

self : TSDARTEstimator

property ood_scores

property state_centers

class tsdart.model.TSDARTLayer(layer_sizes: list, n_states: int, scale=1)

Bases: Module

Create TS-DART lobe.

Parameters

layer_sizes : list

The size of each layer of the encoder. The last entry should be the dimension of the Euclidean space in which the latent hypersphere is embedded.

n_states : int

Number of metastable states to be specified.

scale : int, default = 1

The radius of the hypersphere.

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
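
To make the role of scale concrete, the sketch below projects encoder outputs onto a hypersphere of a given radius by rescaling each vector. This is a generic illustration in NumPy with a hypothetical function name; how TSDARTLayer performs the normalization internally is not stated here and is assumed.

```python
import numpy as np


def project_to_hypersphere(z, scale=1.0):
    """Rescale each row of z to have Euclidean norm equal to scale."""
    z = np.asarray(z, dtype=np.float64)
    return scale * z / np.linalg.norm(z, axis=-1, keepdims=True)


# With feat_dim = 2 the hypersphere is a circle, a 1-dimensional manifold.
embedded = project_to_hypersphere([[3.0, 4.0], [1.0, 0.0]], scale=2.0)
print(np.linalg.norm(embedded, axis=-1))  # [2. 2.]
```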

class tsdart.model.TSDARTModel(lobe, device=None, dtype=<class 'numpy.float32'>)

Bases: object

The model produced by TS-DART.

Parameters

lobe : torch.nn.Module

TS-DART lobe.

device : torch.device, default = None

The device on which the torch modules are executed.

dtype : dtype, default = np.float32

The data type of the input data and the parameters of the model.

property lobe

transform(data, return_type='probs')

Transform the original trajectories to different outputs after training.

Parameters

data : list or ndarray

The original trajectories.

return_type : string, default = ‘probs’

‘probs’: the softmax probabilities of assigning each conformation to a metastable state. ‘states’: the metastable state assignment of each conformation. ‘hypersphere_embs’: the hyperspherical embedding of each conformation.

Returns

The transformed trajectories.
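
The relationship between the ‘probs’ and ‘states’ outputs can be illustrated as follows. The sketch assumes that a state assignment is the index of the largest softmax probability; that assumption, and the toy logits, are for illustration only.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted for numerical stability."""
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


# Toy network outputs for two conformations and four metastable states.
logits = np.array([[2.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 3.0, 0.0]])
probs = softmax(logits)          # analogous to return_type='probs'
states = probs.argmax(axis=-1)   # analogous to return_type='states' (assumed)
print(states)  # [0 2]
```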

tsdart.utils module

tsdart.utils.calculate_inverse(matrix, epsilon=1e-06, return_sqrt=False, mode='regularize')

Compute the inverse, or the square root of the inverse, of a matrix. This function is used to estimate the Koopman matrix.

Parameters

matrix : torch.Tensor

The matrix to be inverted.

epsilon : float, default = 1e-6

The regularization/truncation parameter for eigenvalues.

return_sqrt : boolean, optional, default = False

If True, the square root of the inverse matrix is returned instead.

mode : str, default = ‘regularize’

‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.

Returns

inverse : torch.Tensor

Inverse of the matrix, or its square root if return_sqrt is True.
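
The difference between the two modes is easiest to see on a rank-deficient matrix. The NumPy sketch below inverts a symmetric matrix through its eigendecomposition; the function sym_inverse is a hypothetical stand-in, and the exact treatment of eigenvalues in calculate_inverse (for example taking absolute values before adding epsilon) is an assumption.

```python
import numpy as np


def sym_inverse(matrix, epsilon=1e-6, return_sqrt=False, mode="regularize"):
    """Inverse (or inverse square root) of a symmetric matrix via eigh."""
    eigval, eigvec = np.linalg.eigh(matrix)
    if mode == "regularize":
        eigval = np.abs(eigval) + epsilon      # shift every eigenvalue
    elif mode == "trunc":
        keep = eigval > epsilon                # drop small eigenvalues
        eigval, eigvec = eigval[keep], eigvec[:, keep]
    else:
        raise ValueError(f"unknown mode: {mode}")
    power = -0.5 if return_sqrt else -1.0
    return eigvec @ np.diag(eigval ** power) @ eigvec.T


singular = np.diag([4.0, 0.0])
inv_trunc = sym_inverse(singular, mode="trunc")            # pseudo-inverse
inv_reg = sym_inverse(singular, mode="regularize")         # large but finite entry
sqrt_trunc = sym_inverse(singular, mode="trunc", return_sqrt=True)
```

With ‘trunc’ the null direction is discarded, giving a pseudo-inverse, whereas ‘regularize’ keeps it and produces an entry of order 1/epsilon.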

tsdart.utils.compute_covariance_matrix(x: Tensor, y: Tensor, remove_mean=True)

Compute the covariance matrices from two batches of data.

Parameters

x : torch.Tensor

The first batch of data of shape [batch_size, num_basis].

y : torch.Tensor

The second batch of data of shape [batch_size, num_basis].

remove_mean : boolean, optional, default = True

Whether to remove the mean of the data.

Returns

(cov_00, cov_01, cov11) : Tuple[torch.Tensor, torch.Tensor, torch.Tensor]

Instantaneous covariance matrix of x, time-lagged covariance matrix of x and y, and instantaneous covariance matrix of y.
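
The three matrices can be written out explicitly as in the NumPy sketch below. The helper name covariances and the 1/(batch_size - 1) normalization are assumptions made for illustration; the library may normalize differently.

```python
import numpy as np


def covariances(x, y, remove_mean=True):
    """Return (c00, c01, c11) for two batches of shape [batch_size, num_basis]."""
    n = x.shape[0]
    if remove_mean:
        x = x - x.mean(axis=0)
        y = y - y.mean(axis=0)
    c00 = x.T @ x / (n - 1)   # instantaneous covariance of x
    c01 = x.T @ y / (n - 1)   # time-lagged covariance of x and y
    c11 = y.T @ y / (n - 1)   # instantaneous covariance of y
    return c00, c01, c11


rng = np.random.default_rng(1)
x = rng.normal(size=(100, 3))
y = rng.normal(size=(100, 3))
c00, c01, c11 = covariances(x, y)
print(c00.shape, c01.shape, c11.shape)  # (3, 3) (3, 3) (3, 3)
```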

tsdart.utils.eig_decomposition(matrix, epsilon=1e-06, mode='regularize')

Compute the eigendecomposition of a rank-deficient Hermitian matrix. This function is used to estimate the Koopman matrix.

Parameters

matrix : torch.Tensor

The Hermitian matrix: specifically, the covariance matrix.

epsilon : float, default = 1e-6

The regularization/truncation parameter for eigenvalues.

mode : str, default = ‘regularize’

‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.

Returns

(eigval, eigvec) : Tuple[torch.Tensor, torch.Tensor]

Eigenvalues and eigenvectors.

tsdart.utils.estimate_koopman_matrix(data: Tensor, data_lagged: Tensor, epsilon=1e-06, mode='regularize', symmetrized=False)

Compute the Koopman matrix from time-instant and time-lagged data.

Parameters

data : torch.Tensor

The time-instant data of shape [batch_size, num_basis].

data_lagged : torch.Tensor

The time-lagged data of shape [batch_size, num_basis].

epsilon : float, default = 1e-6

The regularization/truncation parameter for eigenvalues.

mode : str, default = ‘regularize’

‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.

symmetrized : boolean, default = False

Whether to symmetrize time-correlation matrices or not.

Returns

koopman_matrix : torch.Tensor

The Koopman matrix of shape [num_basis, num_basis].
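
The effect of the symmetrized flag can be sketched as follows. The code assumes the usual whitened estimator K = C00^(-1/2) C01 C11^(-1/2) and one common symmetrization, averaging C00 with C11 and C01 with its transpose; both choices, and the helper names, are assumptions rather than a statement of what estimate_koopman_matrix does internally.

```python
import numpy as np


def inv_sqrt(matrix, epsilon=1e-6):
    """Regularized inverse square root of a symmetric matrix."""
    eigval, eigvec = np.linalg.eigh(matrix)
    return eigvec @ np.diag((np.abs(eigval) + epsilon) ** -0.5) @ eigvec.T


def koopman_matrix(data, data_lagged, epsilon=1e-6, symmetrized=False):
    """Whitened Koopman estimate of shape [num_basis, num_basis]."""
    n = data.shape[0]
    x = data - data.mean(axis=0)
    y = data_lagged - data_lagged.mean(axis=0)
    c00, c01, c11 = x.T @ x / (n - 1), x.T @ y / (n - 1), y.T @ y / (n - 1)
    if symmetrized:
        c00 = 0.5 * (c00 + c11)      # pool the instantaneous covariances
        c01 = 0.5 * (c01 + c01.T)    # enforce a symmetric time correlation
        c11 = c00
    return inv_sqrt(c00, epsilon) @ c01 @ inv_sqrt(c11, epsilon)


rng = np.random.default_rng(2)
data = rng.normal(size=(500, 3))
data_lagged = 0.8 * data + 0.2 * rng.normal(size=(500, 3))

k_plain = koopman_matrix(data, data_lagged)
k_sym = koopman_matrix(data, data_lagged, symmetrized=True)
k_identity = koopman_matrix(data, data)  # zero lag: close to the identity
```

With symmetrization the resulting matrix is symmetric, so its eigenvalues are real.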

tsdart.utils.map_data(data, device=None, dtype=<class 'numpy.float32'>)

Yield torch.Tensor data from multiple trajectories.

Parameters

data : list or tuple or ndarray

The trajectories of data.

device : torch.device, default = None

The device on which the torch modules are executed.

dtype : dtype, default = np.float32

The data type of the input data and the parameters of the model.

Yields

x : torch.Tensor

The mapped data.

tsdart.utils.set_random_seed(seed)

Set a random seed.