tsdart package
tsdart.dataprocessing module
- class tsdart.dataprocessing.Preprocessing(dtype=<class 'numpy.float32'>)
Bases: object
Preprocess the original trajectories to create datasets for training.
Parameters
dtype : dtype, default = np.float32
- create_dataset(data, lag_time)
Create the dataset as the input to TS-DART.
Parameters
- data : list or ndarray
The original trajectories.
- lag_time : int
The lag time used to create the dataset of instantaneous and time-lagged data.
Returns
- dataset : list
A list of tuples whose length is the number of data pairs. Each tuple has two elements: the instantaneous data frame and the corresponding time-lagged data frame.
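The pairing described above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper name `make_lagged_pairs` is hypothetical, the input is assumed to be a list of trajectories, and pairs are assumed never to cross trajectory boundaries. The real `create_dataset` may also convert dtypes.

```python
def make_lagged_pairs(trajectories, lag_time):
    """Pair every frame x_t with x_{t + lag_time}, trajectory by trajectory."""
    if lag_time < 1:
        raise ValueError("lag_time must be a positive integer")
    pairs = []
    for traj in trajectories:
        # Pairs never cross trajectory boundaries (an assumption of this sketch);
        # a trajectory with fewer than lag_time + 1 frames contributes nothing.
        for t in range(len(traj) - lag_time):
            pairs.append((traj[t], traj[t + lag_time]))
    return pairs


# Two toy trajectories of scalar "frames" (real frames would be feature vectors).
trajs = [[0, 1, 2, 3, 4], [10, 11, 12]]
dataset = make_lagged_pairs(trajs, lag_time=2)
print(len(dataset))  # 3 pairs from the first trajectory + 1 from the second
print(dataset[0])
```

A longer lag time yields fewer pairs per trajectory, so short trajectories can drop out of the dataset entirely.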
tsdart.loss module
- class tsdart.loss.DisLoss(feat_dim, n_states, device, proto_update_factor=0.5, scaling_temperature=0.1)
Bases: Module
Compute dispersion loss.
Parameters
- feat_dim : int
The dimension of the Euclidean space in which the latent hypersphere is embedded. The dimension of the latent hypersphere is (feat_dim-1).
- n_states : int
Number of metastable states to be specified.
- device : torch.device
The device on which the torch modules are executed.
- proto_update_factor : float, default = 0.5
The state center update factor.
- scaling_temperature : float, default = 0.1
The scaling hyperparameter used to compute the dispersion loss.
- clear()
Clear the list of recorded dispersion losses.
- forward(features, labels)
Compute dispersion loss at every call.
Parameters
- features : torch.Tensor
Hyperspherical embeddings of a batch of data.
- labels : torch.Tensor
Metastable states of a batch of data.
Returns
- loss : torch.Tensor
The dispersion loss.
- output_mean_score()
Output the average of recorded dispersion losses within the list.
Returns
- mean_score : torch.Tensor
The averaged dispersion loss.
- save()
Save the dispersion loss to the list.
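To make the role of `scaling_temperature` concrete, here is a minimal sketch of one form of dispersion loss used in the hyperspherical-embedding literature: the mean over states of the log-mean-exp of pairwise similarities between state centers. Whether `DisLoss` uses exactly this expression is an assumption, and the function name `dispersion_loss` is hypothetical.

```python
import math


def dispersion_loss(prototypes, temperature=0.1):
    """Mean over states of log-mean-exp of pairwise prototype similarities."""
    n_states = len(prototypes)
    total = 0.0
    for i, mu_i in enumerate(prototypes):
        # Dot product of prototype i with every *other* prototype,
        # divided by the temperature.
        sims = [
            sum(a * b for a, b in zip(mu_i, mu_j)) / temperature
            for j, mu_j in enumerate(prototypes)
            if j != i
        ]
        total += math.log(sum(math.exp(s) for s in sims) / (n_states - 1))
    return total / n_states


# Unit vectors on a circle (feat_dim = 2): spread-out prototypes score lower.
spread = [(1.0, 0.0), (-1.0, 0.0)]
close = [(1.0, 0.0), (math.cos(0.1), math.sin(0.1))]
print(dispersion_loss(spread) < dispersion_loss(close))
```

Under this form, minimizing the loss pushes the state centers apart on the hypersphere, and a smaller temperature penalizes nearby centers more sharply.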
- class tsdart.loss.Prototypes(n_states, device, scaling_temperature=0.1)
Bases: Module
Compute the prototypes (state center vectors). Used for evaluating validation data.
Parameters
- n_states : int
Number of metastable states to be specified.
- device : torch.device
The device on which the torch modules are executed.
- scaling_temperature : float, default = 0.1
The scaling hyperparameter used to compute the dispersion loss.
- clear()
Clear the lists of recorded state centers and dispersion losses.
- forward(features, labels)
Compute dispersion loss and state center vectors at every call.
Parameters
- features : torch.Tensor
Hyperspherical embeddings of a batch of data.
- labels : torch.Tensor
Metastable states of a batch of data.
Returns
- prototypes : torch.Tensor
State center vectors of shape [n_states, feat_dim].
- class tsdart.loss.VAMPLoss(epsilon=1e-06, mode='regularize', symmetrized=False)
Bases: Module
Compute VAMP2 loss.
Parameters
- epsilon : float, default = 1e-6
The regularization/truncation parameter for eigenvalues.
- mode : str, default = ‘regularize’
‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.
- symmetrized : boolean, default = False
Whether to symmetrize time-correlation matrices or not.
- clear()
Clear the list of recorded VAMP2 scores.
- forward(data)
Compute VAMP2 loss at every call.
Parameters
- data : tuple
Softmax probabilities of a batch of transition pairs.
Returns
- loss : torch.Tensor
The VAMP-2 loss.
- output_mean_score()
Output the average of recorded VAMP2 scores within the list.
Returns
- mean_score : torch.Tensor
The averaged VAMP-2 score.
- save()
Save the VAMP2 score to the list.
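For orientation, the VAMP-2 score is commonly written in terms of the covariance matrices of the instantaneous and time-lagged network outputs. The symbols below are not defined on this page: C_00 and C_11 denote the instantaneous covariances of the two batches and C_01 their time-lagged cross-covariance. Taking the loss as the negative score is an assumption, and some implementations add a constant 1 to the score when the mean is removed.

```latex
\begin{aligned}
\text{VAMP-2 score} &= \left\lVert C_{00}^{-1/2}\, C_{01}\, C_{11}^{-1/2} \right\rVert_F^{2}, \\
\text{VAMP-2 loss}  &= -\,\text{VAMP-2 score}.
\end{aligned}
```

The inverse square roots are where `epsilon` and `mode` enter: they control how near-zero eigenvalues of C_00 and C_11 are handled before inversion.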
tsdart.model module
- class tsdart.model.TSDART(lobe, optimizer='Adam', device=None, learning_rate=0.001, epsilon=1e-06, mode='regularize', symmetrized=False, dtype=<class 'numpy.float32'>, feat_dim=2, n_states=4, proto_update_factor=0.5, scaling_temperature=0.1, beta=0.01, save_model_interval=None, pretrain=0, print=False)
Bases: object
The method used to train TS-DART.
Parameters
- lobe : torch.nn.Module
TS-DART lobe.
- optimizer : str, default = ‘Adam’
The type of optimizer used for training.
- device : torch.device, default = None
The device on which the torch modules are executed.
- learning_rate : float, default = 1e-3
The learning rate of the optimizer.
- epsilon : float, default = 1e-6
The strength of the regularization/truncation under which matrices are inverted.
- mode : str, default = ‘regularize’
‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.
- symmetrized : boolean, default = False
Whether to symmetrize time-correlation matrices or not.
- dtype : dtype, default = np.float32
The data type of the input data and the parameters of the model.
- feat_dim : int, default = 2
The dimension of the Euclidean space in which the latent hypersphere is embedded. The dimension of the latent hypersphere is (feat_dim-1).
- n_states : int, default = 4
Number of metastable states to be specified.
- proto_update_factor : float, default = 0.5
The state center update factor.
- scaling_temperature : float, default = 0.1
The scaling hyperparameter used to compute the dispersion loss.
- beta : float, default = 0.01
The weight of the dispersion loss.
- save_model_interval : int, default = None
Save the model every ‘save_model_interval’ epochs.
- pretrain : int, default = 0
The number of pretraining epochs that use the pure VAMP2 loss.
- print : boolean, default = False
Whether to print the validation loss every epoch during training.
- fetch_model()
Yields the current model.
- fit(train_loader, n_epochs=1, validation_loader=None, progress=<class 'tqdm.std.tqdm'>)
Performs fit on data.
Parameters
- train_loader : torch.utils.data.DataLoader
Yield a tuple of batches representing instantaneous and time-lagged samples for training.
- n_epochs : int, default=1
The number of epochs to use for training. Note that n_epochs should be larger than pretrain.
- validation_loader : torch.utils.data.DataLoader, optional, default=None
Yield a tuple of batches representing instantaneous and time-lagged samples for validation.
progress : context manager, default=tqdm
Returns
self : TSDART
- partial_fit(data)
Performs partial fit on one batch of data.
Parameters
- data : tuple
The data containing a batch of time-instantaneous data and a batch of time-lagged data.
Returns
self : TSDART
- property training_dis
- property training_vamp
- validate(val_data)
Evaluate the current model on validation data.
Parameters
- val_data : tuple
The validation data containing a batch of time-instantaneous data and a batch of time-lagged data.
- property validation_dis
- property validation_prototypes
- property validation_vamp
- class tsdart.model.TSDARTEstimator(tsdart_model: TSDARTModel)
Bases: object
The TS-DART estimator that generates the state center vectors and OOD scores of the original trajectories.
Parameters
- tsdart_model : TSDARTModel
The trained TS-DART model.
- fit(data)
Fit the TS-DART model with original trajectories to compute OOD scores and state center vectors.
Parameters
- data : list or ndarray
The original trajectories.
Returns
self : TSDARTEstimator
- property ood_scores
- property state_centers
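The classes documented in this module can be combined into a training workflow roughly as follows. This is a sketch built only from the signatures listed on this page, not a verified recipe: the function name `run_tsdart`, the hidden layer sizes, the batch size, the 90/10 split and the epoch counts are all illustrative assumptions, and `trajs` is assumed to be a list of 2-D float32 arrays of shape [n_frames, n_features].

```python
def run_tsdart(trajs, lag_time=10, n_states=4, feat_dim=2, n_epochs=20):
    """Train TS-DART on `trajs` and return states, OOD scores and state centers."""
    # Imported lazily so that the function can be defined and read
    # without torch or tsdart installed.
    import numpy as np
    import torch
    from torch.utils.data import DataLoader, random_split
    from tsdart.dataprocessing import Preprocessing
    from tsdart.model import TSDART, TSDARTEstimator, TSDARTLayer

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # 1. Build (instantaneous, time-lagged) pairs and split them 90/10.
    dataset = Preprocessing(dtype=np.float32).create_dataset(trajs, lag_time)
    n_val = max(1, int(0.1 * len(dataset)))
    train_set, val_set = random_split(dataset, [len(dataset) - n_val, n_val])
    train_loader = DataLoader(train_set, batch_size=1000, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=len(val_set), shuffle=False)

    # 2. Encoder: input dimension first, feat_dim last (hidden sizes are arbitrary).
    n_features = trajs[0].shape[1]
    lobe = TSDARTLayer([n_features, 20, 20, feat_dim], n_states=n_states)

    # 3. Train; the first `pretrain` epochs use the VAMP-2 loss only,
    #    so n_epochs is chosen larger than pretrain.
    tsdart = TSDART(lobe, device=device, feat_dim=feat_dim, n_states=n_states,
                    beta=0.01, pretrain=10)
    model = tsdart.fit(train_loader, n_epochs=n_epochs,
                       validation_loader=val_loader).fetch_model()

    # 4. Post-process the original trajectories with the trained model.
    states = model.transform(trajs, return_type="states")
    estimator = TSDARTEstimator(model).fit(trajs)
    return states, estimator.ood_scores, estimator.state_centers
```

Note that `feat_dim` appears twice, once as the last entry of `layer_sizes` and once in the `TSDART` constructor; the two values presumably need to agree.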
- class tsdart.model.TSDARTLayer(layer_sizes: list, n_states: int, scale=1)
Bases: Module
Create TS-DART lobe.
Parameters
- layer_sizes : list
The size of each layer of the encoder. The last entry should be the dimension of the Euclidean space in which the latent hypersphere is embedded.
- n_states : int
Number of metastable states to be specified.
- scale : int, default = 1
The radius of the hypersphere.
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
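The `scale` parameter is described as the radius of the hypersphere. A minimal sketch of what placing an encoder output on such a hypersphere means is shown below. The helper `project_to_hypersphere` is hypothetical, and whether `TSDARTLayer` normalizes its outputs in exactly this way is an assumption.

```python
import math


def project_to_hypersphere(vector, scale=1.0):
    """Rescale `vector` so that its Euclidean norm equals `scale`."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        raise ValueError("the zero vector has no direction to project")
    return [scale * v / norm for v in vector]


# feat_dim = 2: the latent "hypersphere" is a circle, a 1-dimensional manifold.
emb = project_to_hypersphere([3.0, 4.0], scale=1.0)
print(emb)
```

This also illustrates why the latent hypersphere has dimension (feat_dim-1): fixing the norm removes one degree of freedom.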
- class tsdart.model.TSDARTModel(lobe, device=None, dtype=<class 'numpy.float32'>)
Bases: object
The trained model produced by TS-DART.
Parameters
- lobe : torch.nn.Module
TS-DART lobe.
- device : torch.device, default = None
The device on which the torch modules are executed.
- dtype : dtype, default = np.float32
The data type of the input data and the parameters of the model.
- property lobe
- transform(data, return_type='probs')
Transform the original trajectories to different outputs after training.
Parameters
- data : list or ndarray
The original trajectories.
- return_type : str, default = ‘probs’
‘probs’: the softmax probabilities of assigning each conformation to a metastable state. ‘states’: the metastable state assignment of each conformation. ‘hypersphere_embs’: the hyperspherical embedding of each conformation.
Returns
The transformed trajectories.
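The relation between the ‘probs’ and ‘states’ return types can be illustrated with a small sketch. It assumes that a state assignment is the index of the largest softmax probability, which this page does not state explicitly; the helper name `probs_to_states` is hypothetical.

```python
def probs_to_states(probs):
    """Return the index of the most probable state for every frame."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


# Three frames, n_states = 3; each row is a softmax output that sums to 1.
probs = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.3, 0.4, 0.3]]
print(probs_to_states(probs))  # [0, 2, 1]
```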
tsdart.utils module
- tsdart.utils.calculate_inverse(matrix, epsilon=1e-06, return_sqrt=False, mode='regularize')
Compute the inverse, or the square root of the inverse, of a matrix. This function is used when estimating the Koopman matrix.
Parameters
- matrix : torch.Tensor
The matrix to be inverted.
- epsilon : float, default = 1e-6
The regularization/truncation parameter for eigenvalues.
- return_sqrt : boolean, optional, default = False
If True, the square root of the inverse matrix is returned instead.
- mode : str, default = ‘regularize’
‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.
Returns
- inverse : torch.Tensor
Inverse of the matrix.
- tsdart.utils.compute_covariance_matrix(x: Tensor, y: Tensor, remove_mean=True)
This method can be applied to compute the covariance matrix from two batches of data.
Parameters
- x : torch.Tensor
The first batch of data of shape [batch_size, num_basis].
- y : torch.Tensor
The second batch of data of shape [batch_size, num_basis].
- remove_mean : boolean, optional, default = True
Whether to remove the mean of the data.
Returns
- (cov_00, cov_01, cov_11) : Tuple[torch.Tensor, torch.Tensor, torch.Tensor]
Instantaneous covariance matrix of x, time-lagged covariance matrix of x and y, and instantaneous covariance matrix of y.
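The three matrices can be sketched with nested lists in plain Python. The 1/(batch_size - 1) normalization and the helper name `covariances` are assumptions of this sketch; the library operates on torch.Tensor inputs and may normalize differently.

```python
def covariances(x, y, remove_mean=True):
    """Return (cov_00, cov_01, cov_11) for two equally sized batches of rows."""
    n, k = len(x), len(x[0])

    def center(batch):
        if not remove_mean:
            return batch
        means = [sum(row[j] for row in batch) / n for j in range(k)]
        return [[row[j] - means[j] for j in range(k)] for row in batch]

    def cross(a, b):
        # (1 / (n - 1)) * a^T b, a [k, k] matrix stored as nested lists.
        return [[sum(a[t][i] * b[t][j] for t in range(n)) / (n - 1)
                 for j in range(k)] for i in range(k)]

    x, y = center(x), center(y)
    return cross(x, x), cross(x, y), cross(y, y)


x = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]   # instantaneous batch, shape [3, 2]
y = [[2.0, 1.0], [3.0, 2.0], [4.0, 3.0]]   # time-lagged batch, shape [3, 2]
cov_00, cov_01, cov_11 = covariances(x, y)
print(cov_00)
```

In this toy batch the two columns are perfectly correlated, so cov_00 is rank deficient; this is the situation the `epsilon` and `mode` options of the other helpers are meant to handle.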
- tsdart.utils.eig_decomposition(matrix, epsilon=1e-06, mode='regularize')
Compute the eigendecomposition of a rank-deficient Hermitian matrix. This function is used when estimating the Koopman matrix.
Parameters
- matrix : torch.Tensor
The Hermitian matrix: specifically, the covariance matrix.
- epsilon : float, default = 1e-6
The regularization/truncation parameter for eigenvalues.
- mode : str, default = ‘regularize’
‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.
Returns
- (eigval, eigvec) : Tuple[torch.Tensor, torch.Tensor]
Eigenvalues and eigenvectors.
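The difference between the two modes is easiest to see on a bare list of eigenvalues. The sketch below is an illustration only: the helper `condition_eigenvalues` is hypothetical, and taking the absolute value before adding epsilon in ‘regularize’ mode is an assumption about the implementation.

```python
def condition_eigenvalues(eigvals, epsilon=1e-6, mode="regularize"):
    """Apply the two documented modes to a list of real eigenvalues."""
    if mode == "regularize":
        # Shift every eigenvalue away from zero (taking |.| first is an assumption).
        return [abs(v) + epsilon for v in eigvals]
    if mode == "trunc":
        # Keep only the eigenvalues above the threshold.
        return [v for v in eigvals if v > epsilon]
    raise ValueError(f"unknown mode: {mode!r}")


eigvals = [2.0, 0.5, 1e-9, 0.0]          # a rank-deficient spectrum
kept = condition_eigenvalues(eigvals, mode="trunc")
shifted = condition_eigenvalues(eigvals, mode="regularize")
inv_sqrt = [v ** -0.5 for v in kept]     # safe: no eigenvalue is near zero
print(kept, len(shifted))
```

In ‘trunc’ mode the eigenvectors belonging to the discarded eigenvalues would be dropped as well, so the number of returned eigenpairs can be smaller than the matrix size, whereas ‘regularize’ keeps all of them.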
- tsdart.utils.estimate_koopman_matrix(data: Tensor, data_lagged: Tensor, epsilon=1e-06, mode='regularize', symmetrized=False)
Compute the Koopman matrix from time-instant and time-lagged data.
Parameters
- data : torch.Tensor
The time-instant data of shape [batch_size, num_basis].
- data_lagged : torch.Tensor
The time-lagged data of shape [batch_size, num_basis].
- epsilon : float, default = 1e-6
The regularization/truncation parameter for eigenvalues.
- mode : str, default = ‘regularize’
‘regularize’: regularize the eigenvalues by adding epsilon. ‘trunc’: truncate the eigenvalues by filtering out the eigenvalues below epsilon.
- symmetrized : boolean, default = False
Whether to symmetrize time-correlation matrices or not.
Returns
- koopman_matrix : torch.Tensor
The Koopman matrix of shape [num_basis, num_basis].
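As a rough guide to how the helpers in this module fit together, the estimator is often written as follows, with C_00, C_01 and C_11 the matrices returned by `compute_covariance_matrix` and the inverse square roots obtained from `calculate_inverse` with `return_sqrt=True`. The symmetrized variant shown here follows a common convention and is an assumption about this library; the barred symbols are introduced only for this illustration.

```latex
\begin{aligned}
K &= C_{00}^{-1/2}\, C_{01}\, C_{11}^{-1/2}
  && \text{(symmetrized = False)} \\[4pt]
\bar{C}_{0} &= \tfrac{1}{2}\left(C_{00} + C_{11}\right), \qquad
\bar{C}_{01} = \tfrac{1}{2}\left(C_{01} + C_{01}^{\top}\right) \\
K &= \bar{C}_{0}^{-1/2}\, \bar{C}_{01}\, \bar{C}_{0}^{-1/2}
  && \text{(symmetrized = True)}
\end{aligned}
```

Both forms give a matrix of shape [num_basis, num_basis], matching the documented return value.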
- tsdart.utils.map_data(data, device=None, dtype=<class 'numpy.float32'>)
This function is used to yield the torch.Tensor type data from multiple trajectories.
Parameters
- data : list or tuple or ndarray
The trajectories of data.
- device : torch.device, default = None
The device on which the torch modules are executed.
- dtype : dtype, default = np.float32
The data type of the input data and the parameters of the model.
Yields
- x : torch.Tensor
The mapped data.
- tsdart.utils.set_random_seed(seed)
Set a random seed.