dimers
#
Train against dimer energies.
Classes:
-
Dimer–Represents a single experimental data point.
Functions:
-
create_dataset–Create a dataset from a list of existing dimers.
-
create_dataset_from_generator–Create a dataset from a generator function, avoiding loading all dimers into memory at once.
-
create_from_des–Create a dataset from a DESXXX dimer set.
-
extract_smiles–Return a list of unique SMILES strings in the dataset.
-
compute_dimer_energy–Compute the energy of a dimer in a series of conformers.
-
predict–Predict the energies of each dimer in the dataset.
-
default_closure–Return a default closure function for training against dimer energies.
-
report–Generate a report comparing the predicted and reference energies of each dimer.
Dimer
#
Bases: TypedDict
Represents a single experimental data point.
create_dataset
#
create_dataset(dimers: list[Dimer]) -> Dataset
Create a dataset from a list of existing dimers.
Parameters:
-
dimers(list[Dimer]) –The dimers to create the dataset from.
Returns:
-
Dataset–The created dataset.
Source code in descent/targets/dimers.py
create_dataset_from_generator
#
create_dataset_from_generator(
gen_fn: Callable[[], Iterator[Dimer]],
) -> Dataset
Create a dataset from a generator function, avoiding loading all dimers into memory at once.
Parameters:
-
gen_fn(Callable[[], Iterator[Dimer]]) –A callable that returns an iterator of dimers. It will be called by the HuggingFace datasets library and must be re-iterable (i.e. each call to gen_fn() should produce a fresh iterator).
Returns:
-
Dataset–The created dataset.
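The re-iterability requirement on gen_fn can be illustrated without the library. The sketch below uses only the standard library; `make_records` and the `label` key are hypothetical placeholders, and plain dicts stand in for Dimer entries.

```python
from collections.abc import Iterator


def make_records() -> Iterator[dict]:
    """Hypothetical generator function; plain dicts stand in for Dimer entries."""
    for label in ("dimer-0", "dimer-1", "dimer-2"):
        # In practice each record would be built lazily, e.g. read from disk.
        yield {"label": label}


# Each call to the function starts a fresh iterator, so the data can be read twice.
first_pass = [record["label"] for record in make_records()]
second_pass = [record["label"] for record in make_records()]

# A generator *object*, by contrast, is exhausted after a single pass.
exhausted = make_records()
list(exhausted)
leftover = list(exhausted)  # nothing left to yield
```

Passing the function itself (here `make_records`) rather than the result of calling it is what should satisfy the "fresh iterator on each call" condition described above.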
Source code in descent/targets/dimers.py
create_from_des
#
Create a dataset from a DESXXX dimer set.
Parameters:
-
data_dir(Path) –The path to the DESXXX directory.
-
energy_fn(EnergyFn) –A function which computes the reference energy of a dimer. It should take as input a pandas DataFrame containing the metadata for a given group, a tuple of geometry IDs, and a tensor of coordinates with shape=(n_dimers, n_atoms, 3). It should return a tensor of energies with shape=(n_dimers,) and units of [kcal/mol].
Returns:
-
Dataset–The created dataset.
Source code in descent/targets/dimers.py
extract_smiles
#
Return a list of unique SMILES strings in the dataset.
Parameters:
-
dataset(Dataset) –The dataset to extract the SMILES strings from.
Returns:
-
list[str]–The list of unique SMILES strings.
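As a rough illustration of what collecting unique SMILES involves, the sketch below works on plain dicts rather than a real Dataset. The field names `smiles_a` and `smiles_b`, the helper `unique_smiles`, and the sorted output order are all assumptions made for this example and are not taken from the documentation above.

```python
def unique_smiles(records: list[dict]) -> list[str]:
    """Collect unique monomer SMILES from dimer-like records.

    'smiles_a' and 'smiles_b' are assumed field names, not confirmed by the docs.
    """
    seen: set[str] = set()
    for record in records:
        seen.add(record["smiles_a"])
        seen.add(record["smiles_b"])
    # Sorting is a choice made here to get a deterministic order.
    return sorted(seen)


water = "[O:1]([H:2])[H:3]"
methane = "[C:1]([H:2])([H:3])([H:4])[H:5]"

records = [
    {"smiles_a": water, "smiles_b": water},    # water dimer
    {"smiles_a": water, "smiles_b": methane},  # water-methane dimer
]
smiles = unique_smiles(records)
```

A list like this is a natural starting point for building the topologies dictionaries that predict, default_closure and report expect, since those are keyed by mapped SMILES strings.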
Source code in descent/targets/dimers.py
compute_dimer_energy
#
compute_dimer_energy(
topology_a: TensorTopology,
topology_b: TensorTopology,
force_field: TensorForceField,
coords: Tensor,
) -> Tensor
Compute the energy of a dimer in a series of conformers.
Parameters:
-
topology_a(TensorTopology) –The topology of the first monomer.
-
topology_b(TensorTopology) –The topology of the second monomer.
-
force_field(TensorForceField) –The force field to use.
-
coords(Tensor) –The coordinates of the dimer with shape=(n_dimers, n_atoms, 3).
Returns:
-
Tensor–The energy [kcal/mol] of the dimer in each conformer.
Source code in descent/targets/dimers.py
predict
#
predict(
dataset: Dataset,
force_field: TensorForceField,
topologies: dict[str, TensorTopology],
) -> tuple[Tensor, Tensor]
Predict the energies of each dimer in the dataset.
Parameters:
-
dataset(Dataset) –The dataset to predict the energies of.
-
force_field(TensorForceField) –The force field to use.
-
topologies(dict[str, TensorTopology]) –The topologies of each monomer. Each key should be a fully mapped SMILES string.
Returns:
-
tuple[Tensor, Tensor]–The reference and predicted energies [kcal/mol] of each dimer, each with shape=(n_dimers * n_conf_per_dimer,).
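Because both returned tensors are flat and aligned element by element, a summary statistic such as the RMSE follows directly. The sketch below is a minimal, library-free illustration: it assumes the two tensors have already been converted to plain Python lists (for example with `Tensor.tolist()`), and the numbers are made up.

```python
import math


def rmse(reference: list[float], predicted: list[float]) -> float:
    """Root-mean-square error between two aligned, flattened energy lists."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up energies [kcal/mol]: e.g. 2 dimers with 2 conformers each, flattened.
reference = [-1.0, -2.0, 0.5, 0.0]
predicted = [-1.5, -2.0, 1.0, 0.5]

error = rmse(reference, predicted)
```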
Source code in descent/targets/dimers.py
default_closure
#
default_closure(
trainable: Trainable,
topologies: dict[str, TensorTopology],
dataset: Dataset,
)
Return a default closure function for training against dimer energies.
Parameters:
-
trainable(Trainable) –The wrapper around trainable parameters.
-
topologies(dict[str, TensorTopology]) –The topologies of the molecules present in the dataset, with keys of mapped SMILES patterns.
-
dataset(Dataset) –The dataset to train against.
Returns:
-
The default closure function.
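The signature and loss of the returned closure are not documented here, so the following is only a sketch of the general pattern: a factory captures the training data and returns a function that maps parameters to a loss and its gradient. The one-parameter linear model, `make_closure`, and the `(loss, gradient)` return shape are all illustrative assumptions, not the library's behaviour.

```python
from typing import Callable


def make_closure(
    reference: list[float], features: list[float]
) -> Callable[[float], tuple[float, float]]:
    """Return a function mapping a single parameter to (loss, gradient).

    Toy model: prediction = scale * feature; loss = sum of squared residuals.
    """

    def closure(scale: float) -> tuple[float, float]:
        # The data are captured from the enclosing scope, not passed on each call.
        residuals = [scale * x - y for x, y in zip(features, reference)]
        loss = sum(r * r for r in residuals)
        gradient = sum(2.0 * r * x for r, x in zip(residuals, features))
        return loss, gradient

    return closure


closure = make_closure(reference=[2.0, 4.0], features=[1.0, 2.0])
loss_at_optimum, grad_at_optimum = closure(2.0)
loss_elsewhere, grad_elsewhere = closure(1.0)
```

The point of the pattern is that an optimizer only ever sees a function of the parameters; the dataset and topologies are bound once, when the closure is created.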
Source code in descent/targets/dimers.py
report
#
report(
dataset: Dataset,
force_fields: dict[str, TensorForceField],
topologies: dict[str, dict[str, TensorTopology]],
output_path: Path,
)
Generate a report comparing the predicted and reference energies of each dimer.
Parameters:
-
dataset(Dataset) –The dataset to generate the report for.
-
force_fields(dict[str, TensorForceField]) –The force fields to use to predict the energies.
-
topologies(dict[str, dict[str, TensorTopology]]) –The topologies of each monomer for each force field. The outer keys are force field names, each of which must also be present in force_fields. The inner keys should be fully mapped SMILES strings.
-
output_path(Path) –The path to write the report to.
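Since topologies is nested by force field name and then by SMILES, it can be worth checking the two dictionaries against each other before generating a report. The helper below, `missing_topologies`, is a hypothetical pre-flight check written for this example; it is not part of the module, and `object()` placeholders stand in for real force fields and topologies.

```python
def missing_topologies(
    force_fields: dict[str, object],
    topologies: dict[str, dict[str, object]],
    smiles: list[str],
) -> dict[str, list[str]]:
    """Map each force field name to the SMILES keys it has no topology for."""
    missing: dict[str, list[str]] = {}
    for name in force_fields:
        available = topologies.get(name, {})
        absent = [pattern for pattern in smiles if pattern not in available]
        if absent:
            missing[name] = absent
    return missing


water = "[O:1]([H:2])[H:3]"
methane = "[C:1]([H:2])([H:3])([H:4])[H:5]"

# Placeholder objects; real values would be TensorForceField / TensorTopology.
force_fields = {"baseline": object(), "refit": object()}
topologies = {
    "baseline": {water: object(), methane: object()},
    "refit": {water: object()},  # methane topology deliberately left out
}

gaps = missing_topologies(force_fields, topologies, [water, methane])
```

An empty result means every force field has a topology for every monomer in the list.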
Source code in descent/targets/dimers.py