Models

The models are the classes used to build, train and test a Gaussian process, and then to build the corresponding mapped potential. There are currently six types of model, one for each combination of kernel (2-body, 3-body, or 2+3-body) and number of atomic species (one or two). When creating a model, it is therefore necessary to decide a priori the type of Gaussian process and, consequently, the type of mapped potential we want to obtain.

Building a model

To create a model based on a 2-body kernel for a monoatomic system:

from mff import models
mymodel = models.TwoBodySingleSpecies(atomic_number, cutoff_radius, sigma, theta, noise)

where the parameters refer to the atomic number of the species we are training the GP on, the cutoff radius we want to use, the lengthscale hyperparameter of the Gaussian Process, the hyperparameter governing the exponential decay of the cutoff function, and the noise associated with the output training data. In the case of a 2+3-body kernel for a monoatomic system:

from mff import models
mymodel = models.CombinedSingleSpecies(atomic_number, cutoff_radius, sigma_2b, theta_2b, sigma_3b, theta_3b, noise)

where we have two additional hyperparameters, since the lengthscale value and the cutoff decay ratio of the 2- and 3-body kernels contained inside the combined Gaussian Process are independent.

When dealing with a two-element system, the syntax is very similar, but atomic_number is replaced by atomic_numbers, a list containing the atomic numbers of the two species in increasing order:

from mff import models
mymodel = models.CombinedTwoSpecies(atomic_numbers, cutoff_radius, sigma_2b, theta_2b, sigma_3b, theta_3b, noise)
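Since the two-species constructors expect the atomic numbers in increasing order, it can help to normalise the list before building the model. The snippet below is a minimal sketch of such a check; ordered_atomic_numbers is a hypothetical helper written for this example and is not part of mff.

```python
# Hypothetical helper (not part of mff): normalise the species list before
# passing it to a two-species model constructor.
def ordered_atomic_numbers(numbers):
    """Return the two distinct atomic numbers in increasing order."""
    unique = sorted(set(numbers))
    if len(unique) != 2:
        raise ValueError("a two-species model expects exactly two distinct atomic numbers")
    return unique


atomic_numbers = ordered_atomic_numbers([28, 13])  # Ni and Al, given out of order
print(atomic_numbers)  # [13, 28]
```

The resulting list can then be passed as the first argument of a two-species model.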

Fitting the model

Once a model has been built, we can train it using a dataset of forces, energies, or energies and forces, that has been created using the configurations module. If we are training only on forces:

mymodel.fit(training_confs, training_forces)

training only on energies:

mymodel.fit_energy(training_confs, training_energies)

training on both forces and energies:

mymodel.fit_force_and_energy(training_confs, training_forces, training_energies)

Additionally, the argument ncores can be passed to any fit function in order to run the process on multiple processors:

mymodel.fit(training_confs, training_forces, ncores = 4)
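When the same script runs on machines with different core counts, the value passed as ncores can be capped at what is actually available. This is a small sketch using only the standard library; pick_ncores is a hypothetical helper, and whether mff itself performs a similar check is not stated here.

```python
import os


def pick_ncores(requested):
    """Clamp the requested core count to the range [1, available CPUs]."""
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(requested, available))


ncores = pick_ncores(4)
```

The result can be passed as ncores = ncores to any of the fit functions above.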

Predicting forces and energies with the GP

Once the Gaussian process has been fitted, it can be used to directly predict forces and energies on test configurations. To predict the force and the energy for a single test configuration:

force = mymodel.predict(test_configuration, ncores = ncores)
energy = mymodel.predict_energy(test_configuration, ncores = ncores)

The boolean argument return_std can be passed to the force and energy predict functions to also obtain the standard deviation associated with the prediction (the default is False):

mean_force, std_force = mymodel.predict(test_configuration, return_std = True)
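One way to use the returned standard deviation is to flag test configurations on which the GP is uncertain. The sketch below uses made-up numbers in place of real predictions and assumes that the standard deviation has one value per force component; the helper uncertain_predictions and the threshold value are illustrative only.

```python
# Synthetic stand-ins for the two arrays returned with return_std = True
# (assumed layout: one row per configuration, one column per component).
mean_force = [[0.12, -0.40, 0.03], [1.50, 0.20, -0.75]]
std_force = [[0.01, 0.02, 0.01], [0.30, 0.05, 0.10]]


def uncertain_predictions(stds, threshold):
    """Indices of configurations whose largest component std exceeds threshold."""
    return [i for i, s in enumerate(stds) if max(s) > threshold]


flagged = uncertain_predictions(std_force, threshold=0.1)
print(flagged)  # [1]
```

Configurations flagged in this way are natural candidates to add to the training set.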

Additionally, the argument ncores can be passed to any predict function in order to run the process on multiple processors:

force = mymodel.predict(test_configuration, ncores = 4)

Building a mapped potential

Once the Gaussian process has been fitted, either via force, energy or joint force/energy fit, it can be mapped onto a non-parametric 2- and/or 3-body potential using the build_grid function. The build_grid function takes as arguments the minimum grid distance (smallest distance between atoms for which the potential will be defined), the number of grid points to use while building the 2-body mapped potential, and the number of points per dimension to use while building the 3-body mapped potential. For a 2-body model:

mymodel.build_grid(grid_start, num_2b)

For a 3-body model:

mymodel.build_grid(grid_start, num_3b)

For a combined model:

mymodel.build_grid(grid_start, num_2b, num_3b)

Additionally, the argument ncores can be passed to the build_grid function for any model in order to run the process on multiple processors:

mymodel.build_grid(grid_start, num_2b, num_3b, ncores = 4)

Saving and loading a model

At any stage, a model can be saved using the save function that takes a .json filename as the only input:

mymodel.save("thismodel.json")

The save function creates a .json file containing all of the parameters and hyperparameters of the model, and the paths to the .npy and .npz files containing, respectively, the saved GPs and the saved mapped potentials, which are also created by the save function.

To load a previously saved model of a known type (here for example a CombinedSingleSpecies model) simply run:

mymodel = models.CombinedSingleSpecies.from_json("thismodel.json")

Model’s complete reference

Two Body Model

Module containing the TwoBodySingleSpecies and TwoBodyTwoSpecies classes, which are used to handle the Gaussian process and the mapping algorithm used to build M-FFs. The model has to be first defined, then the Gaussian process must be trained using training configurations and forces (and/or energies). Once a model has been trained, it can be used to predict forces (and/or energies) on unknown atomic configurations. A trained Gaussian process can then be mapped onto a tabulated 2-body potential via the build_grid function call. A mapped model can then be saved, loaded and used to run molecular dynamics simulations via the calculator module. These mapped potentials retain the accuracy of the GP used to build them, while speeding up the calculations by a factor of 10^4 in typical scenarios.

Example:

from mff import models
mymodel = models.TwoBodySingleSpecies(atomic_number, cutoff_radius, sigma, theta, noise)
mymodel.fit(training_confs, training_forces, ncores)

forces = mymodel.predict(test_configurations, ncores=ncores)

mymodel.build_grid(grid_start, num_2b, ncores)
mymodel.save("thismodel.json")

mymodel = models.TwoBodySingleSpecies.from_json("thismodel.json")

class mff.models.twobody.NpEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
default(obj)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
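The name NpEncoder suggests that the class converts NumPy objects before they are written to the .json file, but its implementation is not shown here. As a sketch of the same pattern under that assumption, the encoder below converts any object exposing tolist() or item(), which NumPy arrays and scalars both do. ArrayAwareEncoder is a hypothetical name, and the standard-library array type stands in for a NumPy array so that the example has no third-party dependency.

```python
import json
from array import array


class ArrayAwareEncoder(json.JSONEncoder):
    """Illustrative encoder: converts objects exposing tolist() or item()."""

    def default(self, o):
        if hasattr(o, "tolist"):   # array-like containers
            return o.tolist()
        if hasattr(o, "item"):     # scalar wrappers
            return o.item()
        return super().default(o)  # raises TypeError for unsupported types


payload = {"r_cut": 4.5, "grid": array("d", [0.0, 0.5, 1.0])}
text = json.dumps(payload, cls=ArrayAwareEncoder)
print(text)  # {"r_cut": 4.5, "grid": [0.0, 0.5, 1.0]}
```

Passing the encoder through the cls argument of json.dumps keeps the conversion logic in one place instead of converting every value by hand before saving.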
class mff.models.twobody.TwoBodyManySpeciesModel(elements, r_cut, sigma, theta, noise, rep_sig=1, **kwargs)

2-body many species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters
  • elements (list) – List containing the atomic numbers in increasing order

  • r_cut (float) – The cutoff radius used to carve the atomic environments

  • sigma (float) – Lengthscale parameter of the Gaussian process

  • theta (float) – decay ratio of the cutoff function in the Gaussian Process

  • noise (float) – noise value associated with the training output data

gp

The 2-body two species Gaussian Process

Type

class

grid

Contains the three 2-body two species tabulated potentials, accounting for interactions between two atoms of types 0-0, 0-1, and 1-1.

Type

list

grid_start

Minimum atomic distance for which the grid is defined (cannot be 0)

Type

float

grid_num

number of points used to create the 2-body grids

Type

int

build_grid(start, num, ncores=1)

Build the mapped 2-body potential. Calculates the energy predicted by the GP for two atoms at distances that range from start to r_cut, for a total of num points. These energies are stored and a 1D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any pair of atoms. The total force or local energy can then be calculated for any atom by summing the pairwise contributions of every other atom within a cutoff distance r_cut. Three distinct potentials are built for interactions between atoms of type 0 and 0, type 0 and 1, and type 1 and 1. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters
  • start (float) – smallest interatomic distance for which the energy is predicted by the GP and stored in the 2-body mapped potential

  • num (int) – number of points to use in the grid of the mapped potential

fit(confs, forces, ncores=1)

Fit the GP to a set of training forces using a 2-body two species force-force kernel

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)

Fit the GP to a set of training energies using a 2-body two species energy-energy kernel

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)

Fit the GP to a set of training forces and energies using 2-body two species force-force, energy-force and energy-energy kernels

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters

path (str) – path to the .json model file

Returns

the model object

Return type

model (obj)

load_gp(filename)

Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)

Predict the forces acting on the central atoms of confs using a GP

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

array of force vectors predicted by the GP

forces_errors (array): errors associated with the force predictions, returned only if return_std is True

Return type

forces (array)

predict_energy(glob_confs, return_std=False, ncores=1)

Predict the global energies of the central atoms of confs using a GP

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

Array containing the total energy of each snapshot

energies_errors (array): errors associated with the energy predictions, returned only if return_std is True

Return type

energies (array)

save(path)

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potentials, which are saved as separate .gpy and .gpz files, respectively.

Parameters

path (str) – path to the file

save_gp(filename)

Saves the GP object, now obsolete

class mff.models.twobody.TwoBodySingleSpeciesModel(element, r_cut, sigma, theta, noise, rep_sig=1, **kwargs)

2-body single species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters
  • element (int) – The atomic number of the element considered

  • r_cut (float) – The cutoff radius used to carve the atomic environments

  • sigma (float) – Lengthscale parameter of the Gaussian process

  • theta (float) – decay ratio of the cutoff function in the Gaussian Process

  • noise (float) – noise value associated with the training output data

gp

The 2-body single species Gaussian Process

Type

method

grid

The 2-body single species tabulated potential

Type

method

grid_start

Minimum atomic distance for which the grid is defined (cannot be 0.0)

Type

float

grid_num

number of points used to create the 2-body grid

Type

int

build_grid(start, num, ncores=1)

Build the mapped 2-body potential. Calculates the energy predicted by the GP for two atoms at distances that range from start to r_cut, for a total of num points. These energies are stored and a 1D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any pair of atoms. The total force or local energy can then be calculated for any atom by summing the pairwise contributions of every other atom within a cutoff distance r_cut. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters
  • start (float) – smallest interatomic distance for which the energy is predicted by the GP and stored in the 2-body mapped potential

  • num (int) – number of points to use in the grid of the mapped potential
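The tabulate-then-interpolate idea behind build_grid can be sketched in a few lines. The code below is an illustration only and not the mff implementation: a Lennard-Jones function stands in for the GP energy prediction, and piecewise-linear interpolation replaces the 1D spline to keep the example dependency-free. All function names are hypothetical.

```python
def toy_pair_energy(r):
    """Stand-in for the GP pair energy (a Lennard-Jones form, not a GP)."""
    return 4.0 * ((1.0 / r) ** 12 - (1.0 / r) ** 6)


def build_table(start, r_cut, num, energy_fn):
    """Tabulate energy_fn at num evenly spaced distances in [start, r_cut]."""
    step = (r_cut - start) / (num - 1)
    rs = [start + i * step for i in range(num)]
    return rs, [energy_fn(r) for r in rs]


def interpolate(rs, es, r):
    """Piecewise-linear energy and its slope dE/dr at distance r."""
    step = rs[1] - rs[0]
    i = min(max(int((r - rs[0]) / step), 0), len(rs) - 2)
    slope = (es[i + 1] - es[i]) / step
    return es[i] + slope * (r - rs[i]), slope


def local_energy(rs, es, distances):
    """Sum the pair energies of all neighbours closer than the cutoff."""
    return sum(interpolate(rs, es, r)[0] for r in distances if r < rs[-1])


rs, es = build_table(start=0.8, r_cut=3.0, num=2000, energy_fn=toy_pair_energy)
energy, slope = interpolate(rs, es, 1.5)  # the radial force is -slope
total = local_energy(rs, es, [1.5, 5.0])  # the neighbour at 5.0 is beyond r_cut
```

Evaluating the table costs one index lookup per neighbour, independently of the size of the training set, which is where the speed-up of a mapped potential over the GP comes from.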

fit(confs, forces, ncores=1)

Fit the GP to a set of training forces using a 2-body single species force-force kernel

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation
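To make the M x 5 layout concrete, the sketch below carves the local environment of one atom from a list of positions. The column order used here (relative x, y, z, then the atomic number of the central atom, then that of the neighbour) is an assumption made for illustration, periodic boundary conditions are ignored, and carve_environment is a hypothetical helper rather than the function provided by the configurations module.

```python
import math


def carve_environment(positions, numbers, centre, r_cut):
    """Return the M x 5 rows describing the neighbours of atom `centre`."""
    cx, cy, cz = positions[centre]
    rows = []
    for j, (x, y, z) in enumerate(positions):
        if j == centre:
            continue
        dx, dy, dz = x - cx, y - cy, z - cz
        if math.sqrt(dx * dx + dy * dy + dz * dz) < r_cut:
            # Assumed column layout: dx, dy, dz, Z_central, Z_neighbour
            rows.append([dx, dy, dz, numbers[centre], numbers[j]])
    return rows


positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 3.0, 0.0), (6.0, 0.0, 0.0)]
numbers = [28, 28, 13, 13]
conf = carve_environment(positions, numbers, centre=0, r_cut=4.0)
print(len(conf))  # 2
```

Here M is the number of neighbours within the cutoff, so it generally differs from one configuration to the next.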

fit_energy(glob_confs, energies, ncores=1)

Fit the GP to a set of training energies using a 2-body single species energy-energy kernel

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation
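The list-of-lists structure of glob_confs can be illustrated as follows. This is a minimal sketch with placeholder strings in place of real M x 5 arrays and made-up energies; group_by_snapshot is a hypothetical helper, not an mff function.

```python
def group_by_snapshot(local_confs, snapshot_ids):
    """Group local configurations so that each inner list is one snapshot."""
    grouped = {}
    for conf, sid in zip(local_confs, snapshot_ids):
        grouped.setdefault(sid, []).append(conf)
    # dicts preserve insertion order, so snapshots keep first-seen order
    return list(grouped.values())


local_confs = ["env_a0", "env_a1", "env_b0", "env_b1", "env_b2"]
snapshot_ids = [0, 0, 1, 1, 1]
glob_confs = group_by_snapshot(local_confs, snapshot_ids)
energies = [-7.2, -10.9]  # one total energy per snapshot (made-up values)
```

The outer list has one entry per snapshot, so it must have the same length as the energies array.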

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)

Fit the GP to a set of training forces and energies using 2-body single species force-force, energy-force and energy-energy kernels

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters

path (str) – path to the .json model file

Returns

the model object

Return type

model (obj)

load_gp(filename)

Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)

Predict the forces acting on the central atoms of confs using a GP

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

array of force vectors predicted by the GP

forces_errors (array): errors associated with the force predictions, returned only if return_std is True

Return type

forces (array)

predict_energy(glob_confs, return_std=False, ncores=1)

Predict the global energies of the central atoms of confs using a GP

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

Array containing the total energy of each snapshot

energies_errors (array): errors associated with the energy predictions, returned only if return_std is True

Return type

energies (array)

save(path)

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potential, which are saved as separate .gpy and .gpz files, respectively.

Parameters

path (str) – path to the file

save_gp(filename)

Saves the GP object, now obsolete

Three Body Model

Module containing the ThreeBodySingleSpecies and ThreeBodyTwoSpecies classes, which are used to handle the Gaussian process regression and the mapping algorithm used to build M-FFs. The model has to be first defined, then the Gaussian process must be trained using training configurations and forces (and/or local energies). Once a model has been trained, it can be used to predict forces (and/or energies) on unknown atomic configurations. A trained Gaussian process can then be mapped onto a tabulated 3-body potential via the build_grid function call. A mapped model can then be saved, loaded and used to run molecular dynamics simulations via the calculator module. These mapped potentials retain the accuracy of the GP used to build them, while speeding up the calculations by a factor of 10^4 in typical scenarios.

Example:

from mff import models
mymodel = models.ThreeBodySingleSpecies(atomic_number, cutoff_radius, sigma, theta, noise)
mymodel.fit(training_confs, training_forces, ncores)
forces = mymodel.predict(test_configurations, ncores=ncores)
mymodel.build_grid(grid_start, num_3b, ncores)
mymodel.save("thismodel.json")
mymodel = models.ThreeBodySingleSpecies.from_json("thismodel.json")

class mff.models.threebody.NpEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
default(obj)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)

class mff.models.threebody.ThreeBodyManySpeciesModel(elements, r_cut, sigma, theta, noise, **kwargs)

3-body many species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters
  • elements (list) – List containing the atomic numbers in increasing order

  • r_cut (float) – The cutoff radius used to carve the atomic environments

  • sigma (float) – Lengthscale parameter of the Gaussian process

  • theta (float) – decay ratio of the cutoff function in the Gaussian Process

  • noise (float) – noise value associated with the training output data

gp

The 3-body two species Gaussian Process

Type

class

grid

Contains the four 3-body two species tabulated potentials, accounting for interactions between three atoms of types 0-0-0, 0-0-1, 0-1-1, and 1-1-1.

Type

list
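The number of tabulated potentials follows from counting unordered combinations of the two species: three pairs for a 2-body model and four triplets for a 3-body model. The short sketch below enumerates them with the standard library. The atomic numbers are example values, and the order in which mff stores the grids is not implied by this enumeration.

```python
from itertools import combinations_with_replacement

elements = [13, 28]  # example atomic numbers, in increasing order

# Unordered pairs and triplets of species, repetitions allowed
pairs = list(combinations_with_replacement(elements, 2))
triplets = list(combinations_with_replacement(elements, 3))

print(pairs)     # [(13, 13), (13, 28), (28, 28)]
print(triplets)  # [(13, 13, 13), (13, 13, 28), (13, 28, 28), (28, 28, 28)]
```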

grid_start

Minimum atomic distance for which the grid is defined (cannot be 0)

Type

float

grid_num

number of points per side used to create the 3-body grids. These are 3-dimensional grids, therefore the total number of grid points will be grid_num^3.

Type

int

build_grid(start, num, ncores=1)

Function used to create the four different 3-body energy grids for atoms of elements 0-0-0, 0-0-1, 0-1-1, and 1-1-1. The function calls the build_grid_3b function for each of those combinations of elements.

Parameters
  • start (float) – smallest interatomic distance for which the energy is predicted by the GP and stored in the 3-body mapped potential

  • num (int) – number of points to use to generate the list of distances used to generate the triplets of atoms for the mapped potential

  • ncores (int) – number of CPUs to use to calculate the energy predictions

build_grid_3b(dists, element_i, element_j, element_k, ncores)

Build a mapped 3-body potential. Calculates the energy predicted by the GP for three atoms of elements element_i, element_j, element_k, at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality), found by calling the generate_triplets_with_permutation_invariance function. The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated to any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the triplet contributions of every valid triplet of atoms of which one is always the central one. The prediction is done by the calculator module which is built to work within the ase python package.

Parameters
  • dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)

  • element_i (int) – atomic number of the central atom i in a triplet

  • element_j (int) – atomic number of the second atom j in a triplet

  • element_k (int) – atomic number of the third atom k in a triplet

  • ncores (int) – number of CPUs to use when computing the triplet local energies

Returns

a 3D spline object that can be used to predict the energy and the force associated with the central atom of a triplet.

Return type

spline3D (obj)

fit(confs, forces, ncores=1)

Fit the GP to a set of training forces using a 3-body two species force-force kernel function

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)

Fit the GP to a set of training energies using a 3-body two species energy-energy kernel function

Parameters
  • glob_confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • energies (array) – Array containing the scalar local energies of the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)

Fit the GP to a set of training forces and energies using 3-body two species force-force, energy-force and energy-energy kernels

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • energies (array) – Array containing the scalar local energies of the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters

path (str) – path to the .json model file

Returns

the model object

Return type

model (obj)

static generate_triplets_all(dists)

Generate a list of all valid triplets. Calculates the energy predicted by the GP for three atoms at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the triplet contributions of every valid triplet of atoms of which one is always the central one. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters

dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)

Returns

array of booleans indicating which triplets (three distance values) need to be evaluated to fill the 3D grid of energy values.

r_ij_x (array): array containing the x coordinate of the second atom j w.r.t. the central atom i

r_ki_x (array): array containing the x coordinate of the third atom k w.r.t. the central atom i

r_ki_y (array): array containing the y coordinate of the third atom k w.r.t. the central atom i

Return type

inds (array)
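The three returned coordinates are enough to place a triplet in a plane. As a sketch of the geometry, assume that atom i sits at the origin, atom j lies on the positive x axis and atom k lies in the upper half plane (a convention chosen for this example, not confirmed for mff). The position of k then follows from the law of cosines. The function place_triplet is hypothetical.

```python
import math


def place_triplet(r_ij, r_jk, r_ki):
    """Return (r_ij_x, r_ki_x, r_ki_y) for a triangle with the given sides.

    Atom i is at the origin, atom j at (r_ij, 0) and atom k at
    (r_ki_x, r_ki_y) with r_ki_y >= 0 (assumed convention).
    """
    if r_ij > r_jk + r_ki or r_jk > r_ij + r_ki or r_ki > r_ij + r_jk:
        raise ValueError("distances violate the triangle inequality")
    # Law of cosines: projection of the i-k vector on the x axis
    r_ki_x = (r_ij ** 2 + r_ki ** 2 - r_jk ** 2) / (2.0 * r_ij)
    # max() guards against tiny negative values caused by rounding
    r_ki_y = math.sqrt(max(r_ki ** 2 - r_ki_x ** 2, 0.0))
    return r_ij, r_ki_x, r_ki_y


r_ij_x, r_ki_x, r_ki_y = place_triplet(3.0, 5.0, 4.0)
print(r_ij_x, r_ki_x, r_ki_y)  # 3.0 0.0 4.0
```

For the 3-4-5 triangle used above, atom k ends up directly above atom i, and the distance between j and k is recovered as 5.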

static generate_triplets_with_permutation_invariance(dists)

Generate a list of all valid triplets using permutational invariance. Calculates the energy predicted by the GP for three atoms at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The grid building exploits all the permutation invariances to reduce the number of energy calculations needed to fill the grid. The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the triplet contributions of every valid triplet of atoms of which one is always the central one. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters

dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)

Returns

array of booleans indicating which triplets (three distance values) need to be evaluated to fill the 3D grid of energy values.

r_ij_x (array): array containing the x coordinate of the second atom j w.r.t. the central atom i

r_ki_x (array): array containing the x coordinate of the third atom k w.r.t. the central atom i

r_ki_y (array): array containing the y coordinate of the third atom k w.r.t. the central atom i

Return type

inds (array)
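The saving obtained from the triangle inequality and from permutation invariance can be checked on a tiny grid. The sketch below is not the mff implementation: it counts index triplets for a 4-point distance list, treats degenerate (flat) triangles as valid, and assumes that the energy is fully symmetric under permutation of the three distances, as in a single-species system. The helper names are hypothetical.

```python
from itertools import combinations_with_replacement, product


def linspace(start, stop, num):
    """Evenly spaced values from start to stop, both included."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def is_triangle(a, b, c):
    """True if three lengths satisfy the triangle inequality."""
    return a <= b + c and b <= a + c and c <= a + b


def valid_triplets_all(dists):
    """Every ordered index triplet whose distances form a triangle."""
    n = range(len(dists))
    return [(i, j, k) for i, j, k in product(n, repeat=3)
            if is_triangle(dists[i], dists[j], dists[k])]


def valid_triplets_sorted(dists):
    """One representative (i <= j <= k) per set of permuted triplets."""
    n = range(len(dists))
    return [(i, j, k) for i, j, k in combinations_with_replacement(n, 3)
            if is_triangle(dists[i], dists[j], dists[k])]


dists = linspace(1.0, 4.0, 4)  # [1.0, 2.0, 3.0, 4.0]
all_triplets = valid_triplets_all(dists)
unique_triplets = valid_triplets_sorted(dists)
print(len(dists) ** 3, len(all_triplets), len(unique_triplets))  # 64 52 17
```

Of the 64 grid points, 52 correspond to valid triangles, and only 17 energy evaluations are needed once permuted triplets are identified. The remaining valid entries can be filled by copying.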

load_gp(filename)

Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)

Predict the forces acting on the central atoms of confs using a GP

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

array of force vectors predicted by the GP

forces_errors (array): errors associated with the force predictions, returned only if return_std is True

Return type

forces (array)

predict_energy(glob_confs, return_std=False, ncores=1)

Predict the local energies of the central atoms of confs using a GP

Parameters
  • glob_confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

array of local energies predicted by the GP

energies_errors (array): errors associated with the energy predictions, returned only if return_std is True

Return type

energies (array)

save(path)

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potentials, which are saved as separate .gpy and .gpz files, respectively.

Parameters

path (str) – path to the file

save_gp(filename)

Saves the GP object, now obsolete

update_energy(glob_confs, energies, ncores=1)

Update a fitted GP with a set of energies and using 3-body two species energy-energy kernels

Parameters
  • glob_confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • energies (array) – Array containing the scalar local energies of the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

update_force(confs, forces, ncores=1)

Update a fitted GP with a set of forces, using 3-body two species force-force kernels

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

class mff.models.threebody.ThreeBodySingleSpeciesModel(element, r_cut, sigma, theta, noise, **kwargs)

3-body single species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters
  • element (int) – The atomic number of the element considered

  • r_cut (float) – The cutoff radius used to carve the atomic environments

  • sigma (float) – Lengthscale parameter of the Gaussian process

  • theta (float) – decay ratio of the cutoff function in the Gaussian Process

  • noise (float) – noise value associated with the training output data

gp

The 3-body single species Gaussian Process

Type

method

grid

The 3-body single species tabulated potential

Type

method

grid_start

Minimum atomic distance for which the grid is defined (cannot be 0.0)

Type

float

grid_num

number of points per side used to create the 3-body grid. This is a 3-dimensional grid, therefore the total number of grid points will be grid_num^3.

Type

int

build_grid(start, num, ncores=1)

Build the mapped 3-body potential. Calculates the energy predicted by the GP for three atoms at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The grid building exploits all the permutation invariances to reduce the number of energy calculations needed to fill the grid. The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the triplet contributions of every valid triplet of atoms of which one is always the central one. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters
  • start (float) – smallest interatomic distance for which the energy is predicted by the GP and stored in the 3-body mapped potential

  • num (int) – number of points to use to generate the list of distances used to generate the triplets of atoms for the mapped potential

  • ncores (int) – number of CPUs to use to calculate the energy predictions

fit(confs, forces, ncores=1)

Fit the GP to a set of training forces using a 3-body single species force-force kernel function

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)

Fit the GP to a set of training energies using a 3-body single species energy-energy kernel function

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)

Fit the GP to a set of training forces and energies using 3-body single species force-force, energy-force and energy-energy kernels

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters

path (str) – path to the .json model file

Returns

the model object

Return type

model (obj)

static generate_triplets(dists)

Generate a list of all valid triplets using permutational invariance. Calculates the energy predicted by the GP for three atoms at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The grid building exploits all the permutation invariances to reduce the number of energy calculations needed to fill the grid. The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the contributions of every valid triplet of atoms that includes it as the central one. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters

dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)

Returns
  • inds (array) – array of booleans indicating which triplets (three distance values) need to be evaluated to fill the 3D grid of energy values

  • r_ij_x (array) – array containing the x coordinate of the second atom j w.r.t. the central atom i

  • r_ki_x (array) – array containing the x coordinate of the third atom k w.r.t. the central atom i

  • r_ki_y (array) – array containing the y coordinate of the third atom k w.r.t. the central atom i
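
The triplet selection described above can be illustrated with a short sketch. It is not the library's implementation: it assumes the triplet energy is unchanged under any reordering of the three distances, so only index triplets with a <= b <= c are kept, and it treats degenerate (collinear) triangles as valid. The function name valid_sorted_triplets and the numeric values are hypothetical.

```python
from itertools import combinations_with_replacement


def valid_sorted_triplets(dists):
    """Return index triplets (a <= b <= c) whose distances can form a triangle."""
    triplets = []
    for a, b, c in combinations_with_replacement(range(len(dists)), 3):
        # dists is sorted in increasing order, so dists[c] is the longest side:
        # checking it against the sum of the other two is the full triangle inequality
        if dists[c] <= dists[a] + dists[b]:
            triplets.append((a, b, c))
    return triplets


# Same spacing as np.linspace(start, r_cut, num), written without NumPy
start, r_cut, num = 1.0, 4.0, 7
step = (r_cut - start) / (num - 1)
dists = [start + n * step for n in range(num)]

triplets = valid_sorted_triplets(dists)
print(len(triplets), "GP evaluations instead of", num ** 3, "grid points")
```

Every other grid point that corresponds to a valid triangle can then be filled by copying the energy of its sorted counterpart, which is where the saving in GP evaluations comes from.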

load_gp(filename)

Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)

Predict the forces acting on the central atoms of confs using a GP

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns
  • forces (array) – array of force vectors predicted by the GP

  • forces_errors (array) – errors associated with the force predictions, returned only if return_std is True

predict_energy(confs, return_std=False, ncores=1)

Predict the global energies of the central atoms of confs using a GP

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns
  • energies (array) – array of energies predicted by the GP

  • energies_errors (array) – errors associated with the energy predictions, returned only if return_std is True

save(path)

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potential, which are saved as separate .gpy and .gpz files, respectively.

Parameters

path (str) – path to the file
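
As an illustration of the layout described above (one .json file holding the parameters plus paths to separately stored objects), here is a minimal sketch. The key names params, gp_path and grid_path, the helper functions, and the use of raw bytes as stand-ins for the GP and the mapped potential are all assumptions made for the example; they are not taken from the mff source.

```python
import json
import tempfile
from pathlib import Path


def save_model(path, params, gp_blob, grid_blob=None):
    """Write a JSON index next to the separately stored GP and grid files."""
    path = Path(path)
    gp_path = path.with_suffix(".gpy")
    gp_path.write_bytes(gp_blob)
    record = {"params": params, "gp_path": str(gp_path), "grid_path": None}
    if grid_blob is not None:
        # The mapped potential only exists once a grid has been built
        grid_path = path.with_suffix(".gpz")
        grid_path.write_bytes(grid_blob)
        record["grid_path"] = str(grid_path)
    path.write_text(json.dumps(record, indent=2))


def load_model(path):
    """Read the JSON index, then load whatever it points to."""
    record = json.loads(Path(path).read_text())
    gp_blob = Path(record["gp_path"]).read_bytes()
    grid_blob = None
    if record["grid_path"]:
        grid_blob = Path(record["grid_path"]).read_bytes()
    return record["params"], gp_blob, grid_blob


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "thismodel.json"
    save_model(target, {"element": 26, "r_cut": 4.5}, b"gp-bytes")
    params, gp_blob, grid_blob = load_model(target)
    print(params, grid_blob)
```

Keeping the large binary objects out of the JSON file leaves the index human-readable, and loading can skip the mapped potential when none was saved.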

save_gp(filename)

Saves the GP object, now obsolete

Combined Model

Module that uses 2- and 3-body kernels to do Gaussian process regression, and to build 2- and 3-body mapped potentials. The model has to be first defined, then the Gaussian processes must be trained using training configurations and forces (and/or energies). Once a model has been trained, it can be used to predict forces (and/or energies) on unknown atomic configurations. A trained Gaussian process can then be mapped onto a tabulated 2-body potential and a tabulated 3-body potential via the build_grid function call. A mapped model can then be saved, loaded and used to run molecular dynamics simulations via the calculator module. These mapped potentials retain the accuracy of the GP used to build them, while speeding up the calculations by a factor of 10^4 in typical scenarios.

Example:

from mff import models
mymodel = models.CombinedSingleSpecies(atomic_number, cutoff_radius,
                       sigma_2b, sigma_3b, theta_2b, theta_3b, noise)
mymodel.fit(training_confs, training_forces, ncores=ncores)
# ncores is passed by keyword: the second positional argument of predict is return_std
forces = mymodel.predict(test_configurations, ncores=ncores)
# build_grid takes the number of grid points of both the 2- and the 3-body potential
mymodel.build_grid(grid_start, num_2b, num_3b, ncores=ncores)
mymodel.save("thismodel.json")
mymodel = models.CombinedSingleSpecies.from_json("thismodel.json")

class mff.models.combined.CombinedManySpeciesModel(elements, r_cut, sigma_2b, sigma_3b, theta_2b, theta_3b, noise, rep_sig=1, **kwargs)

2- and 3-body many species model class. Class managing the Gaussian processes and their mapped counterparts.

Parameters
  • elements (list) – List containing the atomic numbers in increasing order

  • r_cut (float) – The cutoff radius used to carve the atomic environments

  • sigma_2b (float) – Lengthscale parameter of the 2-body Gaussian process

  • sigma_3b (float) – Lengthscale parameter of the 3-body Gaussian process

  • theta_2b (float) – decay ratio of the cutoff function in the 2-body Gaussian Process

  • theta_3b (float) – decay ratio of the cutoff function in the 3-body Gaussian Process

  • noise (float) – noise value associated with the training output data

gp_2b

The 2-body single species Gaussian Process

Type

method

gp_3b

The 3-body single species Gaussian Process

Type

method

grid_2b

Contains the three 2-body two species tabulated potentials, accounting for interactions between two atoms of types 0-0, 0-1, and 1-1.

Type

list

grid_3b

Contains the four 3-body two species tabulated potentials, accounting for interactions between three atoms of types 0-0-0, 0-0-1, 0-1-1, and 1-1-1.

Type

list

grid_start

Minimum atomic distance for which the grids are defined (cannot be 0.0)

Type

float

grid_num_2b

number of points to use in the grid of the 2-body mapped potential

Type

int

grid_num_3b

number of points to use to generate the list of distances used to generate the triplets of atoms for the 3-body mapped potential

Type

int

build_grid(start, num_2b, num_3b, ncores=1)

Function used to create the three different 2-body energy grids for atoms of elements 0-0, 0-1, and 1-1, and the four different 3-body energy grids for atoms of elements 0-0-0, 0-0-1, 0-1-1, and 1-1-1. The function calls the build_grid_3b function for each of the 3-body grids to build.

Parameters
  • start (float) – smallest interatomic distance for which the energy is predicted by the GP and stored in the mapped potentials

  • num_2b (int) – number of points to use in the grid of the 2-body mapped potentials

  • num_3b (int) – number of points to use to generate the list of distances used to generate the triplets of atoms for the 3-body mapped potentials

  • ncores (int) – number of CPUs to use to calculate the energy predictions
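
The three pair types and four triplet types listed above are the unordered combinations, with repetition, of the two element indices. A small sketch of how they can be enumerated follows; the atomic numbers are placeholder values and the variable names are not taken from the library.

```python
from itertools import combinations_with_replacement

elements = [26, 28]  # placeholder atomic numbers, in increasing order
indices = range(len(elements))

# Unordered pairs and triplets of element indices, repetition allowed
pairs = list(combinations_with_replacement(indices, 2))
triplets = list(combinations_with_replacement(indices, 3))

for pair in pairs:
    print("2-body grid for element types", "-".join(str(i) for i in pair))
for triplet in triplets:
    print("3-body grid for element types", "-".join(str(i) for i in triplet))
```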

build_grid_3b(dists, element_k, element_i, element_j, ncores=1)

Build a mapped 3-body potential. Calculates the energy predicted by the GP for three atoms of elements element_i, element_j, element_k, at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality), found by calling the generate_triplets_with_permutation_invariance function. The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated to any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the triplet contributions of every valid triplet of atoms of which one is always the central one. The prediction is done by the calculator module which is built to work within the ase python package.

Parameters
  • dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)

  • element_i (int) – atomic number of the central atom i in a triplet

  • element_j (int) – atomic number of the second atom j in a triplet

  • element_k (int) – atomic number of the third atom k in a triplet

  • ncores (int) – number of CPUs to use when computing the triplet local energies

Returns

a 3D spline object that can be used to predict the energy and the force associated with the central atom of a triplet.

Return type

spline3D (obj)

fit(confs, forces, ncores=1)

Fit the GPs to a set of training forces using 2- and 3-body single species force-force kernel functions. The 2-body Gaussian process is fitted first, then the 3-body GP is fitted to the difference between the training forces and the forces predicted by the 2-body GP on the training configurations.

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation
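
The two-stage scheme described for fit (first model fitted to the data, second model fitted to what the first could not explain, prediction as the sum of both) can be sketched with toy one-parameter least-squares models standing in for the two GPs. The feature lists, the fit_scale helper and all numbers are made up for illustration; only the residual-fitting structure mirrors the text.

```python
def fit_scale(features, targets):
    """Least-squares fit of targets ~ w * features; returns the weight w."""
    numerator = sum(f * t for f, t in zip(features, targets))
    denominator = sum(f * f for f in features)
    return numerator / denominator


def sum_squared_error(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets))


# Toy data: one "2-body" and one "3-body" feature per sample (hypothetical stand-ins)
feat_2b = [1.0, 2.0, 3.0, 4.0]
feat_3b = [1.0, -1.0, 1.0, -1.0]
forces = [2.0 * a + 0.5 * b for a, b in zip(feat_2b, feat_3b)]

# Stage 1: fit the 2-body model to the raw training forces
w_2b = fit_scale(feat_2b, forces)
pred_2b = [w_2b * a for a in feat_2b]

# Stage 2: fit the 3-body model to the residual left by the 2-body model
residuals = [f - p for f, p in zip(forces, pred_2b)]
w_3b = fit_scale(feat_3b, residuals)

# Prediction: sum of the two contributions
pred_total = [w_2b * a + w_3b * b for a, b in zip(feat_2b, feat_3b)]

print("2-body only error:", sum_squared_error(pred_2b, forces))
print("2+3-body error:   ", sum_squared_error(pred_total, forces))
```

Fitting in sequence means the 3-body term only has to describe the correction to the 2-body term, at the cost of not optimising the two jointly.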

fit_energy(glob_confs, energies, ncores=1)

Fit the GPs to a set of training energies using 2- and 3-body single species energy-energy kernel functions. The 2-body Gaussian process is fitted first, then the 3-body GP is fitted to the difference between the training energies and the energies predicted by the 2-body GP on the training configurations.

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation
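
The nesting of glob_confs can be illustrated as follows. The sketch assumes that the total energy of a snapshot is the sum of the local energies of its atoms, and that each row of a local configuration holds three relative coordinates followed by two atomic numbers; both are assumptions made for the example, as are the helper total_energies and the toy energy function.

```python
def total_energies(glob_confs, local_energy):
    """Sum a per-atom energy over the local configurations of each snapshot."""
    return [sum(local_energy(conf) for conf in snapshot) for snapshot in glob_confs]


# Local configurations: M rows of 5 entries each (assumed layout: x, y, z of a
# neighbour relative to the central atom, then two atomic numbers)
conf_a = [[1.0, 0.0, 0.0, 26, 26], [0.0, 1.0, 0.0, 26, 26]]
conf_b = [[-1.0, 0.0, 0.0, 26, 26]]
conf_c = [[0.0, 0.0, 2.0, 26, 26]]

# Outer list: snapshots. Inner lists: the local configurations of that snapshot.
glob_confs = [[conf_a, conf_b], [conf_c]]

# Toy local energy: -0.5 per neighbour, only to make the sums easy to follow
energies = total_energies(glob_confs, lambda conf: -0.5 * len(conf))
print(energies)
```

The energies array passed to fit_energy then has one entry per snapshot, in the same order as the outer list.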

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)

Fit the GPs to a set of training forces and energies using 2- and 3-body single species force-force, energy-energy, and energy-force kernel functions. The 2-body Gaussian process is fitted first, then the 3-body GP is fitted to the difference between the training energies (and forces) and the 2-body predictions of energies (and forces) on the training configurations.

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)

Load the model. Loads the model, the associated GPs and the mapped potentials, if available.

Parameters

path (str) – path to the .json model file

Returns

the model object

Return type

model (obj)

static generate_triplets_all(dists)

Generate a list of all valid triplets. Calculates the energy predicted by the GP for three atoms at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the contributions of every valid triplet of atoms that includes it as the central one. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters

dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)

Returns
  • inds (array) – array of booleans indicating which triplets (three distance values) need to be evaluated to fill the 3D grid of energy values

  • r_ij_x (array) – array containing the x coordinate of the second atom j w.r.t. the central atom i

  • r_ki_x (array) – array containing the x coordinate of the third atom k w.r.t. the central atom i

  • r_ki_y (array) – array containing the y coordinate of the third atom k w.r.t. the central atom i
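
The three returned coordinates are enough to place a triplet in a plane. The sketch below shows one way such coordinates can be obtained from three distances, using the law of cosines with atom i at the origin and atom j on the positive x axis. The names r_ij_x, r_ki_x and r_ki_y follow the return values above, but the construction itself is an assumption and is not taken from the mff source.

```python
import math


def place_triplet(r_ij, r_ik, r_jk):
    """Place atom i at the origin, atom j on the +x axis and atom k at y >= 0."""
    r_ij_x = r_ij
    # Law of cosines: projection of the i-k vector onto the x axis
    r_ki_x = (r_ij ** 2 + r_ik ** 2 - r_jk ** 2) / (2.0 * r_ij)
    # max() guards against small negative values from rounding for collinear atoms
    r_ki_y = math.sqrt(max(r_ik ** 2 - r_ki_x ** 2, 0.0))
    return r_ij_x, r_ki_x, r_ki_y


r_ij_x, r_ki_x, r_ki_y = place_triplet(3.0, 4.0, 5.0)

# Check: the distance between j and k is recovered from the coordinates
r_jk_check = math.hypot(r_ki_x - r_ij_x, r_ki_y)
print(r_ij_x, r_ki_x, r_ki_y, r_jk_check)
```

The square root is only real when the three distances satisfy the triangle inequality, which is why invalid triplets are masked out before any coordinates are built.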

load_gp(filename_2b, filename_3b)

Loads the GP objects, now obsolete

predict(confs, return_std=False, ncores=1)

Predict the forces acting on the central atoms of confs using the 2- and 3-body GPs. The total force is the sum of the two predictions.

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns
  • forces (array) – array of force vectors predicted by the GPs

  • forces_errors (array) – errors associated with the force predictions, returned only if return_std is True

predict_energy(glob_confs, return_std=False, ncores=1)

Predict the local energies of the central atoms of confs using the 2- and 3-body GPs. The total energy is the sum of the two predictions.

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns
  • energies (array) – Array containing the total energy of each snapshot

  • energies_errors (array) – errors associated with the energy predictions, returned only if return_std is True

save(path)

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potentials, which are saved as separate .gpy and .gpz files, respectively.

Parameters

path (str) – path to the file

save_gp(filename_2b, filename_3b)

Saves the GP objects, now obsolete

class mff.models.combined.CombinedSingleSpeciesModel(element, r_cut, sigma_2b, sigma_3b, theta_2b, theta_3b, noise, rep_sig=1, **kwargs)

2- and 3-body single species model class. Class managing the Gaussian processes and their mapped counterparts.

Parameters
  • element (int) – The atomic number of the element considered

  • r_cut (float) – The cutoff radius used to carve the atomic environments

  • sigma_2b (float) – Lengthscale parameter of the 2-body Gaussian process

  • sigma_3b (float) – Lengthscale parameter of the 3-body Gaussian process

  • theta_2b (float) – decay ratio of the cutoff function in the 2-body Gaussian Process

  • theta_3b (float) – decay ratio of the cutoff function in the 3-body Gaussian Process

  • noise (float) – noise value associated with the training output data

gp_2b

The 2-body single species Gaussian Process

Type

method

gp_3b

The 3-body single species Gaussian Process

Type

method

grid_2b

The 2-body single species tabulated potential

Type

method

grid_3b

The 3-body single species tabulated potential

Type

method

grid_start

Minimum atomic distance for which the grids are defined (cannot be 0.0)

Type

float

grid_num

number of points per side used to create the 2- and 3-body grid. The 3-body grid is 3-dimensional, therefore its total number of grid points will be grid_num^3

Type

int

build_grid(start, num_2b, num_3b, ncores=1)

Build the mapped 2- and 3-body potentials. Calculates the energy predicted by the GP for two and three atoms at all possible combinations of num_2b and num_3b distances, respectively, ranging from start to r_cut. The energy for the 3-body mapped grid is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The grid building exploits all the permutation invariances to reduce the number of energy calculations needed to fill the grid. The computed 2-body energies are stored in an array of values, and a 1D spline interpolation is created. The computed 3-body energies are stored in a 3D cube of values, and a 3D spline interpolation is created. The total force or local energy can then be calculated for any atom by summing the pairwise and triplet contributions of every valid pair and triplet of atoms that includes it as the central one. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters
  • start (float) – smallest interatomic distance for which the energy is predicted by the GP and stored in the mapped potentials

  • num_2b (int) – number of points to use in the grid of the 2-body mapped potential

  • num_3b (int) – number of points to use to generate the list of distances used to generate the triplets of atoms for the 3-body mapped potential

  • ncores (int) – number of CPUs to use to calculate the energy predictions

fit(confs, forces, ncores=1)

Fit the GPs to a set of training forces using 2- and 3-body single species force-force kernel functions. The 2-body Gaussian process is fitted first, then the 3-body GP is fitted to the difference between the training forces and the forces predicted by the 2-body GP on the training configurations.

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)

Fit the GPs to a set of training energies using 2- and 3-body single species energy-energy kernel functions. The 2-body Gaussian process is fitted first, then the 3-body GP is fitted to the difference between the training energies and the energies predicted by the 2-body GP on the training configurations.

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)

Fit the GPs to a set of training forces and energies using 2- and 3-body single species force-force, energy-energy, and energy-force kernel functions. The 2-body Gaussian process is fitted first, then the 3-body GP is fitted to the difference between the training energies (and forces) and the 2-body predictions of energies (and forces) on the training configurations.

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)

Load the model. Loads the model, the associated GPs and the mapped potentials, if available.

Parameters

path (str) – path to the .json model file

Returns

the model object

Return type

model (obj)

static generate_triplets(dists)

Generate a list of all valid triplets using permutational invariance. Calculates the energy predicted by the GP for three atoms at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The grid building exploits all the permutation invariances to reduce the number of energy calculations needed to fill the grid. The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the contributions of every valid triplet of atoms that includes it as the central one. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters

dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)

Returns
  • inds (array) – array of booleans indicating which triplets (three distance values) need to be evaluated to fill the 3D grid of energy values

  • r_ij_x (array) – array containing the x coordinate of the second atom j w.r.t. the central atom i

  • r_ki_x (array) – array containing the x coordinate of the third atom k w.r.t. the central atom i

  • r_ki_y (array) – array containing the y coordinate of the third atom k w.r.t. the central atom i

load_gp(filename_2b, filename_3b)

Loads the GP objects, now obsolete

predict(confs, return_std=False, ncores=1)

Predict the forces acting on the central atoms of confs using the 2- and 3-body GPs. The total force is the sum of the two predictions.

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns
  • forces (array) – array of force vectors predicted by the GPs

  • forces_errors (array) – errors associated with the force predictions, returned only if return_std is True

predict_energy(glob_confs, return_std=False, ncores=1)

Predict the local energies of the central atoms of confs using the 2- and 3-body GPs. The total energy is the sum of the two predictions.

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns
  • energies (array) – Array containing the total energy of each snapshot

  • energies_errors (array) – errors associated with the energy predictions, returned only if return_std is True

save(path)

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potentials, which are saved as separate .gpy and .gpz files, respectively.

Parameters

path (str) – path to the file

save_gp(filename_2b, filename_3b)

Saves the GP objects, now obsolete

class mff.models.combined.NpEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
default(obj)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
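
The docstring above is the generic one inherited from json.JSONEncoder and does not say what NpEncoder itself converts. Judging only from its name, it presumably turns NumPy values into plain Python ones; under that assumption, a minimal sketch of such an encoder could look like the following. ArrayFriendlyEncoder and FakeArray are hypothetical names, and the stand-in class is used so that the example runs without NumPy installed.

```python
import json


class ArrayFriendlyEncoder(json.JSONEncoder):
    """Encoder that converts array-like objects through their tolist() method."""

    def default(self, o):
        # NumPy scalars and arrays expose tolist(), which returns plain Python values
        if hasattr(o, "tolist"):
            return o.tolist()
        # Anything else: let the base class raise the TypeError
        return super().default(o)


class FakeArray:
    """Stand-in for an array type, used only to keep the example dependency-free."""

    def __init__(self, values):
        self._values = list(values)

    def tolist(self):
        return list(self._values)


model_params = {"grid_start": 1.5, "dists": FakeArray([1.5, 2.0, 2.5])}
text = json.dumps(model_params, cls=ArrayFriendlyEncoder)
print(text)
```

Passing the encoder through the cls argument of json.dumps keeps the conversion in one place instead of cleaning every value before saving.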

Eam Model

Module that uses an eam-like many-body kernel to do Gaussian process regression, and to build eam mapped potentials. The model has to be first defined, then the Gaussian processes must be trained using training configurations and forces (and/or energies). Once a model has been trained, it can be used to predict forces (and/or energies) on unknown atomic configurations. A trained Gaussian process can then be mapped onto a tabulated eam potential via the build_grid function call. A mapped model can then be saved, loaded and used to run molecular dynamics simulations via the calculator module. These mapped potentials retain the accuracy of the GP used to build them, while speeding up the calculations by a factor of 10^4 in typical scenarios.

Example:

from mff import models
mymodel = models.EamSingleSpecies(atomic_number, cutoff_radius,
                        sigma, r0, noise)
mymodel.fit(training_confs, training_forces, ncores=ncores)
# ncores is passed by keyword: the second positional argument of predict is return_std
forces = mymodel.predict(test_configurations, ncores=ncores)
# the eam build_grid takes only the number of grid points
mymodel.build_grid(num, ncores=ncores)
mymodel.save("thismodel.json")
mymodel = models.EamSingleSpecies.from_json("thismodel.json")

TwoThreeEam Model

class mff.models.eam.EamManySpeciesModel(elements, r_cut, sigma, r0, noise, **kwargs)

Eam many species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters
  • elements (list) – List containing the atomic numbers of the elements considered

  • r_cut (float) – The cutoff radius used to carve the atomic environments

  • sigma (float) – Lengthscale parameter of the Gaussian process

  • r0 (float) – radius in the exponent of the eam descriptor

  • noise (float) – noise value associated with the training output data

gp

The eam single species Gaussian Process

Type

method

grid

The eam single species tabulated potential

Type

method

grid_start

Minimum descriptor value for which the grid is defined

Type

float

grid_end

Maximum descriptor value for which the grid is defined

Type

float

grid_num

number of points used to create the eam multi grid

Type

int

build_grid(num, ncores=1)

Build the mapped eam potential. Calculates the energy predicted by the GP for configurations whose eam descriptor is evaluated between start and end. These energies are stored and a 1D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any embedded atom. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters
  • num (int) – number of points to use in the grid of the mapped potential

  • ncores (int) – number of CPUs to use for the gram matrix evaluation
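
The idea of mapping an expensive predictor onto a 1D table can be sketched as follows. This is not the library's code: piecewise-linear interpolation is used here in place of the spline mentioned above to keep the example dependency-free, and toy_energy is a made-up stand-in for the GP energy as a function of the descriptor.

```python
import bisect
import math


def tabulate(energy_fn, start, end, num):
    """Evaluate energy_fn once on num evenly spaced descriptor values."""
    step = (end - start) / (num - 1)
    xs = [start + n * step for n in range(num)]
    return xs, [energy_fn(x) for x in xs]


def interpolate(xs, ys, x):
    """Return (value, slope) at x from the table by piecewise-linear interpolation."""
    # Index of the grid interval containing x, clamped to the table range
    idx = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
    slope = (ys[idx + 1] - ys[idx]) / (xs[idx + 1] - xs[idx])
    return ys[idx] + slope * (x - xs[idx]), slope


def toy_energy(descriptor):
    """Hypothetical smooth function standing in for the GP prediction."""
    return -math.sqrt(descriptor)


xs, ys = tabulate(toy_energy, 0.5, 4.0, 400)
value, slope = interpolate(xs, ys, 2.0)
print(value, toy_energy(2.0))
print(slope, -0.5 / math.sqrt(2.0))
```

After the table is built, each prediction costs one lookup and a few arithmetic operations regardless of the size of the training set, and the slope of the interpolant supplies the derivative needed for forces.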

fit(confs, forces, ncores=1)

Fit the GP to a set of training forces using an eam single species force-force kernel

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)

Fit the GP to a set of training energies using an eam single species energy-energy kernel

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)

Fit the GP to a set of training forces and energies using eam single species force-force, energy-force and energy-energy kernels

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters

path (str) – path to the .json model file

Returns

the model object

Return type

model (obj)

load_gp(filename)

Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)

Predict the forces acting on the central atoms of confs using a GP

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns
  • forces (array) – array of force vectors predicted by the GP

  • forces_errors (array) – errors associated with the force predictions, returned only if return_std is True

predict_energy(glob_confs, return_std=False, ncores=1)

Predict the global energies of the central atoms of confs using a GP

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns
  • energies (array) – Array containing the total energy of each snapshot

  • energies_errors (array) – errors associated with the energy predictions, returned only if return_std is True

save(path)

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potential, which are saved as separate .gpy and .gpz files, respectively.

Parameters

path (str) – path to the file

save_gp(filename)

Saves the GP object, now obsolete

class mff.models.eam.EamSingleSpeciesModel(element, r_cut, sigma, r0, noise, **kwargs)

Eam single species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters
  • element (int) – The atomic number of the element considered

  • r_cut (float) – The cutoff radius used to carve the atomic environments

  • sigma (float) – Lengthscale parameter of the Gaussian process

  • r0 (float) – radius in the exponent of the eam descriptor

  • noise (float) – noise value associated with the training output data

gp

The eam single species Gaussian Process

Type

method

grid

The eam single species tabulated potential

Type

method

grid_start

Minimum descriptor value for which the grid is defined

Type

float

grid_end

Maximum descriptor value for which the grid is defined

Type

float

grid_num

number of points used to create the eam multi grid

Type

int

build_grid(num, ncores=1)

Build the mapped eam potential. Calculates the energy predicted by the GP for configurations whose eam descriptor is evaluated between start and end. These energies are stored and a 1D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any embedded atom. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters
  • num (int) – number of points to use in the grid of the mapped potential

  • ncores (int) – number of CPUs to use for the gram matrix evaluation
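
The tabulate-then-interpolate idea described above can be sketched as follows. This is a minimal illustration rather than mff's implementation: toy_energy is a hypothetical stand-in for the GP energy prediction, the grid bounds are arbitrary, and SciPy's CubicSpline is assumed as the interpolator.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def toy_energy(descriptor):
    """Hypothetical stand-in for the GP energy as a function of the EAM descriptor."""
    return -np.sqrt(descriptor)


# Assumed grid bounds and resolution (they play the role of grid_start, grid_end, num).
grid_start, grid_end, num = 0.5, 4.0, 200

# Tabulate the energy once on a regular grid of descriptor values.
grid_points = np.linspace(grid_start, grid_end, num)
tabulated = toy_energy(grid_points)

# 1D spline through the tabulated values, plus its analytic derivative.
energy_spline = CubicSpline(grid_points, tabulated)
denergy_spline = energy_spline.derivative()

# Evaluating the splines only requires a lookup and a polynomial evaluation.
x = 2.0
print(energy_spline(x), denergy_spline(x))
```

In this sketch the derivative is taken with respect to the descriptor; turning it into a force additionally requires the derivative of the descriptor with respect to the atomic positions (chain rule).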

fit(confs, forces, ncores=1)

Fit the GP to a set of training forces using an EAM single species force-force kernel

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)

Fit the GP to a set of training energies using an EAM single species energy-energy kernel

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation
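
The difference between confs and glob_confs is mostly one of nesting. The sketch below builds toy data with NumPy only; the column layout of the M x 5 arrays (three relative coordinates followed by two atomic numbers) is an assumption made for illustration, and random_local_conf is a hypothetical helper, not part of mff.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_local_conf(n_neighbours, atomic_number=28):
    """Hypothetical helper: one local environment as an (M, 5) array.

    Assumed layout: columns 0-2 hold neighbour positions relative to the
    central atom, columns 3-4 hold atomic numbers (central, neighbour).
    """
    positions = rng.uniform(-3.0, 3.0, size=(n_neighbours, 3))
    species = np.full((n_neighbours, 2), atomic_number, dtype=float)
    return np.hstack([positions, species])


# Two snapshots with 3 and 4 atoms: one local configuration per atom.
snapshot_sizes = [3, 4]
glob_confs = [
    [random_local_conf(n_neighbours=5) for _ in range(n_atoms)]
    for n_atoms in snapshot_sizes
]

# One total energy per snapshot (toy values), so len(energies) == len(glob_confs).
energies = np.array([-12.3, -16.1])

# Flattening the groups gives a flat list of local configurations.
confs = [conf for snapshot in glob_confs for conf in snapshot]
print(len(glob_confs), len(confs), confs[0].shape)  # prints: 2 7 (5, 5)
```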

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)

Fit the GP to a set of training forces and energies using EAM single species force-force, energy-force and energy-energy kernels

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters

path (str) – path to the .json model file

Returns

the model object

Return type

model (obj)

load_gp(filename)

Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)

Predict the forces acting on the central atoms of confs using a GP

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns
  • forces (array) – array of force vectors predicted by the GP

  • forces_errors (array) – errors associated to the force predictions, returned only if return_std is True
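
A short sketch of how the optional uncertainty might be consumed. StubModel is a hypothetical stand-in that mimics the documented predict signature so that the snippet runs without a trained GP; it is assumed here that, with return_std=True, the forces and their errors come back as a pair of arrays of the same shape.

```python
import numpy as np


class StubModel:
    """Hypothetical stand-in mimicking the documented predict() signature."""

    def predict(self, confs, return_std=False, ncores=1):
        forces = np.zeros((len(confs), 3))
        if return_std:
            # Assumption: one error value per force component.
            return forces, np.full((len(confs), 3), 0.05)
        return forces


model = StubModel()
confs = [np.zeros((4, 5)) for _ in range(3)]  # three toy local environments

# Mean prediction only.
forces = model.predict(confs)

# Mean prediction and associated errors; ncores is passed by keyword so it
# is not mistaken for return_std, which comes second in the signature.
forces, forces_errors = model.predict(confs, return_std=True, ncores=1)

# Flag environments whose largest predicted error exceeds a chosen threshold.
threshold = 0.1
uncertain = np.where(forces_errors.max(axis=1) > threshold)[0]
print(forces.shape, forces_errors.shape, uncertain)
```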

predict_energy(glob_confs, return_std=False, ncores=1)

Predict the total energy of each snapshot in glob_confs using a GP

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns
  • energies (array) – Array containing the total energy of each snapshot

  • energies_errors (array) – errors associated to the energies predictions, returned only if return_std is True

save(path)

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potential, which are saved as separate .gpy and .gpz files, respectively.

Parameters

path (str) – path to the file

save_gp(filename)

Saves the GP object, now obsolete

class mff.models.eam.NpEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)

default(obj)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
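
The class name suggests that this encoder exists to serialise NumPy values, several of which the standard json module rejects. A minimal sketch of that pattern follows; NumpyJSONEncoder is a hypothetical name and the type handling is an assumption, not a copy of mff's NpEncoder.

```python
import json

import numpy as np


class NumpyJSONEncoder(json.JSONEncoder):
    """Hypothetical encoder that converts NumPy values to plain Python types."""

    def default(self, o):
        if isinstance(o, np.integer):
            return int(o)
        if isinstance(o, np.floating):
            return float(o)
        if isinstance(o, np.ndarray):
            return o.tolist()
        # Let the base class raise the TypeError for anything else.
        return super().default(o)


# Toy parameter dictionary mixing NumPy scalars and an array.
params = {"element": np.int64(28), "r_cut": np.float32(4.5), "grid": np.arange(3)}
encoded = json.dumps(params, cls=NumpyJSONEncoder)
print(encoded)
```

The encoder is selected through the cls argument of json.dumps, so the rest of the serialisation code stays unchanged.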

Module that uses a 2+3+many-body kernel to do Gaussian process regression and to build EAM mapped potentials. The model must first be defined, then the Gaussian processes must be trained using training configurations and forces (and/or energies). Once a model has been trained, it can be used to predict forces (and/or energies) on unknown atomic configurations. A trained Gaussian process can then be mapped onto tabulated 2-body, 3-body and EAM potentials via the build_grid function call. A mapped model can then be saved, loaded and used to run molecular dynamics simulations via the calculator module. These mapped potentials retain the accuracy of the GP used to build them, while speeding up the calculations by a factor of 10^4 in typical scenarios.

Example:

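from mff import models

# Define the model, then train it on forces
mymodel = models.TwoThreeEamSingleSpecies(atomic_number, cutoff_radius, sigma_2b, sigma_3b,
                                          sigma_eam, theta_2b, theta_3b, alpha, r0, noise, rep_sig)
mymodel.fit(training_confs, training_forces, ncores)

# Pass ncores by keyword: in the predict signatures documented here, the second positional argument is return_std
forces = mymodel.predict(test_configurations, ncores=ncores)

# Map the trained GP onto tabulated potentials, then save and reload the model with the same class
mymodel.build_grid(start, start_eam, end_eam, num_2b, num_3b, num_eam, ncores)
mymodel.save("thismodel.json")
mymodel = models.TwoThreeEamSingleSpecies.from_json("thismodel.json")
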
class mff.models.eam.EamManySpeciesModel(elements, r_cut, sigma, r0, noise, **kwargs)

EAM many-species model class. Manages the Gaussian process and its mapped counterpart.

Parameters
  • elements (list of int) – The atomic numbers of the elements considered

  • r_cut (float) – The cutoff radius used to carve the atomic environments

  • sigma (float) – Lengthscale parameter of the Gaussian process

  • r0 (float) – radius in the exponent of the eam descriptor

  • noise (float) – noise value associated with the training output data

gp

The eam many species Gaussian Process

Type

method

grid

The eam many species tabulated potential

Type

method

grid_start

Minimum descriptor value for which the grid is defined

Type

float

grid_end

Maximum descriptor value for which the grid is defined

Type

float

grid_num

number of points used to create the eam multi grid

Type

int

build_grid(num, ncores=1)

Build the mapped EAM potential. Calculates the energy predicted by the GP for a configuration whose EAM descriptor is evaluated between start and end. These energies are stored and a 1D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any embedded atom. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters
  • num (int) – number of points to use in the grid of the mapped potential

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit(confs, forces, ncores=1)

Fit the GP to a set of training forces using an EAM many species force-force kernel

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)

Fit the GP to a set of training energies using an EAM many species energy-energy kernel

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)

Fit the GP to a set of training forces and energies using EAM many species force-force, energy-force and energy-energy kernels

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • forces (array) – Array containing the vector forces on the central atoms of the training configurations

  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • energies (array) – Array containing the total energy of each snapshot

  • ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters

path (str) – path to the .json model file

Returns

the model object

Return type

model (obj)

load_gp(filename)

Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)

Predict the forces acting on the central atoms of confs using a GP

Parameters
  • confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns
  • forces (array) – array of force vectors predicted by the GP

  • forces_errors (array) – errors associated to the force predictions, returned only if return_std is True

predict_energy(glob_confs, return_std=False, ncores=1)

Predict the total energy of each snapshot in glob_confs using a GP

Parameters
  • glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot

  • return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns
  • energies (array) – Array containing the total energy of each snapshot

  • energies_errors (array) – errors associated to the energies predictions, returned only if return_std is True

save(path)

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potential, which are saved as separate .gpy and .gpz files, respectively.

Parameters

path (str) – path to the file

save_gp(filename)

Saves the GP object, now obsolete
