Models¶

The models are the classes used to build, train and test a Gaussian process, and then to build the corresponding mapped potential. There are currently six types of models, each handling a 2-, 3-, or 2+3-body kernel for either one or two atomic species. When creating a model, it is therefore necessary to decide a priori the type of Gaussian process and, consequently, the type of mapped potential we want to obtain.

Building a model¶

To create a model based on a 2-body kernel for a monoatomic system:

from mff import models
mymodel = models.TwoBodySingleSpecies(atomic_number, cutoff_radius, sigma, theta, noise)

where the parameters refer to the atomic number of the species we are training the GP on, the cutoff radius we want to use, the lengthscale hyperparameter of the Gaussian Process, the hyperparameter governing the exponential decay of the cutoff function, and the noise associated with the output training data. In the case of a 2+3-body kernel for a monoatomic system:

from mff import models
mymodel = models.CombinedSingleSpecies(atomic_number, cutoff_radius, sigma_2b, theta_2b, sigma_3b, theta_3b, noise)

where we have two additional hyperparameters, since the lengthscale value and the cutoff decay ratio of the 2- and 3-body kernels contained inside the combined Gaussian Process are independent.

When dealing with a two-element system, the syntax is very similar, but the atomic_number is instead a list containing the atomic numbers of the two species, in increasing order:

from mff import models
mymodel = models.CombinedTwoSpecies(atomic_numbers, cutoff_radius, sigma_2b, theta_2b, sigma_3b, theta_3b, noise)
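To give a feel for what the lengthscale (sigma), the cutoff decay (theta) and the cutoff radius control, here is a minimal sketch that does not use mff. It assumes a squared-exponential kernel on interatomic distances and a cutoff function of the form exp(-theta / (r_cut - r)); both functional forms and all helper names are assumptions made for illustration and may differ from the kernels mff actually implements.

```python
import math

def squared_exponential(r1, r2, sigma):
    """Similarity of two interatomic distances; sigma is the lengthscale."""
    return math.exp(-(r1 - r2) ** 2 / (2.0 * sigma ** 2))

def cutoff(r, r_cut, theta):
    """Assumed smooth cutoff: decays to 0 as r approaches r_cut."""
    if r >= r_cut:
        return 0.0
    return math.exp(-theta / (r_cut - r))

def pair_kernel(r1, r2, sigma, theta, r_cut):
    """Toy 2-body kernel between two distances, damped near the cutoff."""
    damping = cutoff(r1, r_cut, theta) * cutoff(r2, r_cut, theta)
    return squared_exponential(r1, r2, sigma) * damping

r_cut, theta = 5.0, 0.5
for sigma in (0.2, 1.0):
    # A larger lengthscale makes the distances 2.0 and 2.5 look more alike
    print(sigma, pair_kernel(2.0, 2.5, sigma, theta, r_cut))
```

In this toy picture a larger sigma correlates distances that are further apart, while theta sets how quickly contributions fade as a neighbour approaches the cutoff radius.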

Fitting the model¶

Once a model has been built, we can train it using a dataset of forces, energies, or energies and forces, that has been created using the configurations module. If we are training only on forces:

mymodel.fit(training_confs, training_forces)

training only on energies:

mymodel.fit_energy(training_confs, training_energies)

training on both forces and energies:

mymodel.fit_force_and_energy(training_confs, training_forces, training_energies)

Additionally, the argument ncores can be passed to any fit function in order to run the process on multiple processors:

mymodel.fit(training_confs, training_forces, ncores = 4)
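The training configurations are local atomic environments: for each central atom, the atoms found within the cutoff radius. In mff they are produced by the configurations module; the sketch below only illustrates the idea in plain Python for a small, non-periodic cluster. The function name and the row layout [dx, dy, dz, Z_central, Z_neighbour] are assumptions chosen for illustration, not the documented mff format.

```python
import math

def carve_local_environments(positions, atomic_numbers, r_cut):
    """Return one list of rows per atom; each row describes one neighbour.

    Assumed (hypothetical) row layout: [dx, dy, dz, Z_central, Z_neighbour].
    """
    confs = []
    for i, (pos_i, z_i) in enumerate(zip(positions, atomic_numbers)):
        rows = []
        for j, (pos_j, z_j) in enumerate(zip(positions, atomic_numbers)):
            if i == j:
                continue
            # Position of neighbour j relative to the central atom i
            delta = [b - a for a, b in zip(pos_i, pos_j)]
            if math.sqrt(sum(d * d for d in delta)) < r_cut:
                rows.append(delta + [z_i, z_j])
        confs.append(rows)
    return confs

# Three atoms of the same species on a line, 2.0 length units apart
positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
atomic_numbers = [28, 28, 28]
confs = carve_local_environments(positions, atomic_numbers, r_cut=3.0)

for i, conf in enumerate(confs):
    print(i, len(conf), "neighbours")
```

Each environment has a different number of rows M, which is why the configurations are passed as a list of arrays rather than as a single array.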

Predicting forces and energies with the GP¶

Once the Gaussian process has been fitted, it can be used to directly predict forces and energies on test configurations. To predict the force and the energy for a single test configuration:

force = mymodel.predict(test_configuration, ncores=ncores)
energy = mymodel.predict_energy(test_configuration, ncores=ncores)

The boolean argument return_std can be passed to the force and energy predict functions in order to also obtain the standard deviation associated with the prediction; the default is False:

mean_force, std_force = mymodel.predict(test_configuration, return_std = True)

Additionally, the argument ncores can be passed to any predict function in order to run the process on multiple processors:

force = mymodel.predict(test_configuration, ncores = 4)
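The standard deviation returned with return_std comes from the Gaussian process posterior. The following is a minimal one-dimensional sketch of that mechanism in plain Python, not the mff code: it assumes a zero-mean GP with a squared-exponential kernel and treats noise as a standard deviation added in quadrature to the diagonal of the Gram matrix. All function names are hypothetical.

```python
import math

def kernel(x1, x2, sigma):
    return math.exp(-(x1 - x2) ** 2 / (2.0 * sigma ** 2))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def gp_predict(x_train, y_train, x_test, sigma, noise, return_std=False):
    """Posterior mean (and optionally standard deviation) at x_test."""
    n = len(x_train)
    gram = [[kernel(x_train[i], x_train[j], sigma)
             + (noise ** 2 if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    k_star = [kernel(x_test, x, sigma) for x in x_train]
    alpha = solve(gram, y_train)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    if not return_std:
        return mean
    v = solve(gram, k_star)
    variance = kernel(x_test, x_test, sigma) - sum(k * w for k, w in zip(k_star, v))
    return mean, math.sqrt(max(variance, 0.0))

x_train, y_train = [0.0, 1.0, 2.0], [0.0, 1.0, 4.0]
mean_near, std_near = gp_predict(x_train, y_train, 1.0, sigma=1.0, noise=0.01, return_std=True)
mean_far, std_far = gp_predict(x_train, y_train, 10.0, sigma=1.0, noise=0.01, return_std=True)
print(std_near, std_far)
```

The uncertainty is small close to the training data and approaches the prior value far from it, which is what makes the standard deviation useful for spotting test configurations the training set does not cover.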

Building a mapped potential¶

Once the Gaussian process has been fitted, either via force, energy or joint force/energy fit, it can be mapped onto a non-parametric 2- and/or 3-body potential using the build_grid function. The build_grid function takes as arguments the minimum grid distance (smallest distance between atoms for which the potential will be defined), the number of grid points to use while building the 2-body mapped potential, and the number of points per dimension to use while building the 3-body mapped potential. For a 2-body model:

mymodel.build_grid(grid_start, num_2b)

For a 3-body model:

mymodel.build_grid(grid_start, num_3b)

For a combined model:

mymodel.build_grid(grid_start, num_2b, num_3b)

Additionally, the argument ncores can be passed to the build_grid function for any model in order to run the process on multiple processors:

mymodel.build_grid(grid_start, num_2b, num_3b, ncores = 4)
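The idea behind the 2-body mapping can be sketched without mff: evaluate a pair energy on a regular grid of distances between the grid start and the cutoff radius, then answer later queries by interpolation instead of calling the expensive predictor again. In the sketch below a Lennard-Jones function stands in for the GP prediction and piecewise-linear interpolation stands in for the spline; both substitutions, and all names, are assumptions made for illustration.

```python
def linspace(start, stop, num):
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]

def toy_pair_energy(r):
    """Stand-in for the GP pair-energy prediction (Lennard-Jones form)."""
    return 4.0 * ((1.0 / r) ** 12 - (1.0 / r) ** 6)

def build_pair_table(energy_fn, grid_start, r_cut, num_2b):
    """Tabulate the pair energy at num_2b distances in [grid_start, r_cut]."""
    distances = linspace(grid_start, r_cut, num_2b)
    return distances, [energy_fn(r) for r in distances]

def interpolate(distances, energies, r):
    """Piecewise-linear lookup; returns 0 at or beyond the cutoff."""
    if r >= distances[-1]:
        return 0.0
    if r < distances[0]:
        raise ValueError("distance below grid_start: potential undefined")
    step = distances[1] - distances[0]
    i = min(int((r - distances[0]) / step), len(distances) - 2)
    t = (r - distances[i]) / step
    return (1.0 - t) * energies[i] + t * energies[i + 1]

distances, energies = build_pair_table(
    toy_pair_energy, grid_start=0.9, r_cut=3.0, num_2b=200)
print(interpolate(distances, energies, 1.5), toy_pair_energy(1.5))
```

The grid start matters because the tabulated potential is simply undefined below it, so it should be smaller than any interatomic distance expected during a simulation.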

Saving and loading a model¶

At any stage, a model can be saved using the save function that takes a .json filename as the only input:

mymodel.save("thismodel.json")

The save function will create a .json file containing all of the parameters and hyperparameters of the model, together with the paths to the .npy and .npz files containing, respectively, the saved GPs and the saved mapped potentials. These files are also created by the save function.

To load a previously saved model of a known type (here for example a CombinedSingleSpecies model) simply run:

mymodel = models.CombinedSingleSpecies.from_json("thismodel.json")
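The split between a small .json description and separate binary files can be illustrated with the standard library alone. The sketch below writes and reads back a description of this kind; every key name and file name in it is hypothetical and chosen for illustration, and it should not be read as the schema mff actually writes.

```python
import json
import os
import tempfile

# Hypothetical layout: key and file names are illustrative, not mff's schema
model_description = {
    "model": "CombinedSingleSpecies",
    "element": 28,
    "r_cut": 4.5,
    "hyperparameters": {
        "sigma_2b": 0.5, "theta_2b": 0.4,
        "sigma_3b": 1.0, "theta_3b": 0.4,
        "noise": 0.001,
    },
    # Large arrays live in separate files; the .json only stores their paths
    "gp_files": ["gp_2b.npy", "gp_3b.npy"],
    "grid_files": ["grid_2b.npz", "grid_3b.npz"],
}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "thismodel.json")
    with open(path, "w") as handle:
        json.dump(model_description, handle, indent=2)
    with open(path) as handle:
        restored = json.load(handle)

print(restored["hyperparameters"]["sigma_2b"])
```

Keeping only paths in the .json file means the referenced files have to travel with it when a saved model is moved to another machine.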

Model’s complete reference¶

Two Body Model¶

Module containing the TwoBodySingleSpecies and TwoBodyTwoSpecies classes, which are used to handle the Gaussian process and the mapping algorithm used to build M-FFs. The model must first be defined, then the Gaussian process must be trained using training configurations and forces (and/or energies). Once a model has been trained, it can be used to predict forces (and/or energies) on unknown atomic configurations. A trained Gaussian process can then be mapped onto a tabulated 2-body potential via the build_grid function call. A mapped model can then be saved, loaded and used to run molecular dynamics simulations via the calculator module. These mapped potentials retain the accuracy of the GP used to build them, while speeding up the calculations by a factor of 10^4 in typical scenarios.

Example:

from mff import models
mymodel = models.TwoBodySingleSpecies(atomic_number, cutoff_radius, sigma, theta, noise)
mymodel.fit(training_confs, training_forces, ncores)

forces = mymodel.predict(test_configurations, ncores=ncores)

mymodel.build_grid(grid_start, num_2b, ncores)
mymodel.save("thismodel.json")

mymodel = models.TwoBodySingleSpecies.from_json("thismodel.json")

class mff.models.twobody.NpEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)¶

default(obj)¶

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)

class mff.models.twobody.TwoBodyManySpeciesModel(elements, r_cut, sigma, theta, noise, rep_sig=1, **kwargs)¶

2-body many species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters

elements (list) – List containing the atomic numbers in increasing order
r_cut (float) – The cutoff radius used to carve the atomic environments
sigma (float) – Lengthscale parameter of the Gaussian process
theta (float) – decay ratio of the cutoff function in the Gaussian Process
noise (float) – noise value associated with the training output data

gp¶

The 2-body two species Gaussian Process

Type: class

grid¶

Contains the three 2-body two species tabulated potentials, accounting for interactions between two atoms of types 0-0, 0-1, and 1-1.

Type: list

grid_start¶

Minimum atomic distance for which the grid is defined (cannot be 0)

Type: float

grid_num¶

number of points used to create the 2-body grids

Type: int

build_grid(start, num, ncores=1)¶

Build the mapped 2-body potential. Calculates the energy predicted by the GP for two atoms at distances that range from start to r_cut, for a total of num points. These energies are stored and a 1D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any pair of atoms. The total force or local energy can then be calculated for any atom by summing the pairwise contributions of every other atom within a cutoff distance r_cut. Three distinct potentials are built for interactions between atoms of type 0 and 0, type 0 and 1, and type 1 and 1. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters

start (float) – smallest interatomic distance for which the energy is predicted by the GP and stored in the 2-body mapped potential
num (int) – number of points to use in the grid of the mapped potential

fit(confs, forces, ncores=1)¶

Fit the GP to a set of training forces using a 2-body two species force-force kernel

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)¶

Fit the GP to a set of training energies using a 2-body two species energy-energy kernel

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)¶

Fit the GP to a set of training forces and energies using 2-body two species force-force, energy-force and energy-energy kernels

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)¶

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters: path (str) – path to the .json model file
Returns: the model object
Return type: model (obj)

load_gp(filename)¶: Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)¶

Predict the forces acting on the central atoms of confs using a GP

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

forces (array) – array of force vectors predicted by the GP
forces_errors (array) – errors associated with the force predictions, returned when return_std is True

predict_energy(glob_confs, return_std=False, ncores=1)¶

Predict the global energies of the central atoms of confs using a GP

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

energies (array) – array containing the total energy of each snapshot
energies_errors (array) – errors associated with the energy predictions, returned when return_std is True

save(path)¶

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potentials, which are saved as separate .npy and .npz files, respectively.

Parameters: path (str) – path to the file

save_gp(filename)¶: Saves the GP object, now obsolete

class mff.models.twobody.TwoBodySingleSpeciesModel(element, r_cut, sigma, theta, noise, rep_sig=1, **kwargs)¶

2-body single species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters

element (int) – The atomic number of the element considered
r_cut (float) – The cutoff radius used to carve the atomic environments
sigma (float) – Lengthscale parameter of the Gaussian process
theta (float) – decay ratio of the cutoff function in the Gaussian Process
noise (float) – noise value associated with the training output data

gp¶

The 2-body single species Gaussian Process

Type: method

grid¶

The 2-body single species tabulated potential

Type: method

grid_start¶

Minimum atomic distance for which the grid is defined (cannot be 0.0)

Type: float

grid_num¶

number of points used to create the 2-body grid

Type: int

build_grid(start, num, ncores=1)¶

Build the mapped 2-body potential. Calculates the energy predicted by the GP for two atoms at distances that range from start to r_cut, for a total of num points. These energies are stored and a 1D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any pair of atoms. The total force or local energy can then be calculated for any atom by summing the pairwise contributions of every other atom within a cutoff distance r_cut. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters

start (float) – smallest interatomic distance for which the energy is predicted by the GP and stored in the 2-body mapped potential
num (int) – number of points to use in the grid of the mapped potential
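How a pair potential turns into a force on the central atom can be sketched in a few lines. This is not the calculator module: a harmonic function stands in for the tabulated potential, a central finite difference stands in for the analytic derivative of the spline, and the total energy is assumed to be a sum of pair energies over distinct pairs. All names are hypothetical.

```python
import math

def pair_energy(r):
    """Stand-in for the tabulated pair potential (harmonic well at r = 1)."""
    return (r - 1.0) ** 2

def pair_energy_derivative(r, h=1e-5):
    """Central finite difference, standing in for the spline derivative."""
    return (pair_energy(r + h) - pair_energy(r - h)) / (2.0 * h)

def force_on_central_atom(neighbour_vectors, r_cut):
    """Sum the pairwise force contributions of neighbours within r_cut.

    neighbour_vectors holds neighbour positions relative to the central atom.
    """
    force = [0.0, 0.0, 0.0]
    for vec in neighbour_vectors:
        r = math.sqrt(sum(c * c for c in vec))
        if r >= r_cut:
            continue  # atoms beyond the cutoff do not contribute
        de_dr = pair_energy_derivative(r)
        # F_i = dE/dr * (r_j - r_i) / r: a stretched bond pulls i towards j,
        # a compressed bond pushes i away from j
        for axis in range(3):
            force[axis] += de_dr * vec[axis] / r
    return force

neighbours = [(1.5, 0.0, 0.0), (0.0, -0.8, 0.0), (0.0, 0.0, 6.0)]
force = force_on_central_atom(neighbours, r_cut=3.0)
print(force)
```

Here the first neighbour is further than the equilibrium distance and attracts the central atom, the second is closer and repels it, and the third lies outside the cutoff and is ignored.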

fit(confs, forces, ncores=1)¶

Fit the GP to a set of training forces using a 2-body single species force-force kernel

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)¶

Fit the GP to a set of training energies using a 2-body single species energy-energy kernel

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)¶

Fit the GP to a set of training forces and energies using 2-body single species force-force, energy-force and energy-energy kernels

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)¶

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters: path (str) – path to the .json model file
Returns: the model object
Return type: model (obj)

load_gp(filename)¶: Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)¶

Predict the forces acting on the central atoms of confs using a GP

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

forces (array) – array of force vectors predicted by the GP
forces_errors (array) – errors associated with the force predictions, returned when return_std is True

predict_energy(glob_confs, return_std=False, ncores=1)¶

Predict the global energies of the central atoms of confs using a GP

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

energies (array) – array containing the total energy of each snapshot
energies_errors (array) – errors associated with the energy predictions, returned when return_std is True

save(path)¶

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potential, which are saved as separate .npy and .npz files, respectively.

Parameters: path (str) – path to the file

save_gp(filename)¶: Saves the GP object, now obsolete

Three Body Model¶

Module containing the ThreeBodySingleSpecies and ThreeBodyTwoSpecies classes, which are used to handle the Gaussian process regression and the mapping algorithm used to build M-FFs. The model must first be defined, then the Gaussian process must be trained using training configurations and forces (and/or local energies). Once a model has been trained, it can be used to predict forces (and/or energies) on unknown atomic configurations. A trained Gaussian process can then be mapped onto a tabulated 3-body potential via the build_grid function call. A mapped model can then be saved, loaded and used to run molecular dynamics simulations via the calculator module. These mapped potentials retain the accuracy of the GP used to build them, while speeding up the calculations by a factor of 10^4 in typical scenarios.

Example:

from mff import models
mymodel = models.ThreeBodySingleSpecies(atomic_number, cutoff_radius, sigma, theta, noise)
mymodel.fit(training_confs, training_forces, ncores)
forces = mymodel.predict(test_configurations, ncores=ncores)
mymodel.build_grid(grid_start, num_3b, ncores)
mymodel.save("thismodel.json")
mymodel = models.ThreeBodySingleSpecies.from_json("thismodel.json")

class mff.models.threebody.NpEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)¶

default(obj)¶

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)

class mff.models.threebody.ThreeBodyManySpeciesModel(elements, r_cut, sigma, theta, noise, **kwargs)¶

3-body many species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters

elements (list) – List containing the atomic numbers in increasing order
r_cut (float) – The cutoff radius used to carve the atomic environments
sigma (float) – Lengthscale parameter of the Gaussian process
theta (float) – decay ratio of the cutoff function in the Gaussian Process
noise (float) – noise value associated with the training output data

gp¶

The 3-body two species Gaussian Process

Type: class

grid¶

Contains the four 3-body two species tabulated potentials, accounting for interactions between three atoms of types 0-0-0, 0-0-1, 0-1-1, and 1-1-1.

Type: list

grid_start¶

Minimum atomic distance for which the grid is defined (cannot be 0)

Type: float

grid_num¶

number of points per side used to create the 3-body grids. These are 3-dimensional grids, therefore the total number of grid points will be grid_num^3.

Type: int

build_grid(start, num, ncores=1)¶

Function used to create the four different 3-body energy grids for atoms of elements 0-0-0, 0-0-1, 0-1-1, and 1-1-1. The function calls the build_grid_3b function for each of those combinations of elements.

Parameters

start (float) – smallest interatomic distance for which the energy is predicted by the GP and stored in the 3-body mapped potential
num (int) – number of points to use to generate the list of distances used to generate the triplets of atoms for the mapped potential
ncores (int) – number of CPUs to use to calculate the energy predictions
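The four element combinations listed above are the unordered triplets that can be formed from two species. A short sketch with the standard library makes the counting explicit; the atomic numbers are hypothetical example values, and the code only enumerates the combinations without building any grid.

```python
from itertools import combinations_with_replacement

# Hypothetical two-species system, atomic numbers in increasing order
elements = [28, 46]

# Unordered pairs: one 2-body grid each (types 0-0, 0-1, 1-1)
pairs = list(combinations_with_replacement(elements, 2))

# Unordered triplets: one 3-body grid each (types 0-0-0, 0-0-1, 0-1-1, 1-1-1)
triplets = list(combinations_with_replacement(elements, 3))

print(pairs)
print(triplets)
```

The same enumeration with two elements per combination gives the three grids of the 2-body two species model.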

build_grid_3b(dists, element_i, element_j, element_k, ncores)¶

Build a mapped 3-body potential. Calculates the energy predicted by the GP for three atoms of elements element_i, element_j, element_k, at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality), found by calling the generate_triplets_with_permutation_invariance function. The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated to any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the triplet contributions of every valid triplet of atoms of which one is always the central one. The prediction is done by the calculator module which is built to work within the ase python package.

Parameters

dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)
element_i (int) – atomic number of the central atom i in a triplet
element_j (int) – atomic number of the second atom j in a triplet
element_k (int) – atomic number of the third atom k in a triplet
ncores (int) – number of CPUs to use when computing the triplet local energies

Returns

spline3D (obj) – a 3D spline object that can be used to predict the energy and the force associated with the central atom of a triplet

fit(confs, forces, ncores=1)¶

Fit the GP to a set of training forces using a 3-body two species force-force kernel function

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)¶

Fit the GP to a set of training energies using a 3-body two species energy-energy kernel function

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
energies (array) – Array containing the scalar local energies of the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)¶

Fit the GP to a set of training forces and energies using 3-body two species force-force, energy-force and energy-energy kernels

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
energies (array) – Array containing the scalar local energies of the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)¶

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters: path (str) – path to the .json model file
Returns: the model object
Return type: model (obj)

static generate_triplets_all(dists)¶

Generate a list of all valid triplets. Calculates the energy predicted by the GP for three atoms at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the triplet contributions of every valid triplet of atoms of which one is always the central one. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters

dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)

Returns

inds (array) – array of booleans indicating which triplets (three distance values) need to be evaluated to fill the 3D grid of energy values
r_ij_x (array) – array containing the x coordinate of the second atom j w.r.t. the central atom i
r_ki_x (array) – array containing the x coordinate of the third atom k w.r.t. the central atom i
r_ki_y (array) – array containing the y coordinate of the third atom k w.r.t. the central atom i

static generate_triplets_with_permutation_invariance(dists)¶

Generate a list of all valid triplets using permutational invariance. Calculates the energy predicted by the GP for three atoms at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The grid building exploits all the permutation invariances to reduce the number of energy calculations needed to fill the grid. The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the triplet contributions of every valid triplet of atoms of which one is always the central one. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters

dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)

Returns

inds (array) – array of booleans indicating which triplets (three distance values) need to be evaluated to fill the 3D grid of energy values
r_ij_x (array) – array containing the x coordinate of the second atom j w.r.t. the central atom i
r_ki_x (array) – array containing the x coordinate of the third atom k w.r.t. the central atom i
r_ki_y (array) – array containing the y coordinate of the third atom k w.r.t. the central atom i

load_gp(filename)¶: Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)¶

Predict the forces acting on the central atoms of confs using a GP

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

forces (array) – array of force vectors predicted by the GP
forces_errors (array) – errors associated with the force predictions, returned when return_std is True

predict_energy(glob_confs, return_std=False, ncores=1)¶

Predict the local energies of the central atoms of confs using a GP

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

energies (array) – array of local energies predicted by the GP
energies_errors (array) – errors associated with the energy predictions, returned when return_std is True

save(path)¶

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potentials, which are saved as separate .npy and .npz files, respectively.

Parameters: path (str) – path to the file

save_gp(filename)¶: Saves the GP object, now obsolete

update_energy(glob_confs, energies, ncores=1)¶

Update a fitted GP with a set of energies and using 3-body two species energy-energy kernels

Parameters

glob_confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
energies (array) – Array containing the scalar local energies of the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

update_force(confs, forces, ncores=1)¶

Update a fitted GP with a set of forces, using 3-body two species force-force kernels

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

class mff.models.threebody.ThreeBodySingleSpeciesModel(element, r_cut, sigma, theta, noise, **kwargs)¶

3-body single species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters

element (int) – The atomic number of the element considered
r_cut (float) – The cutoff radius used to carve the atomic environments
sigma (float) – Lengthscale parameter of the Gaussian process
theta (float) – decay ratio of the cutoff function in the Gaussian Process
noise (float) – noise value associated with the training output data

gp¶

The 3-body single species Gaussian Process

Type: method

grid¶

The 3-body single species tabulated potential

Type: method

grid_start¶

Minimum atomic distance for which the grid is defined (cannot be 0.0)

Type: float

grid_num¶

number of points per side used to create the 3-body grid. This is a 3-dimensional grid, therefore the total number of grid points will be grid_num^3.

Type: int

build_grid(start, num, ncores=1)¶

Build the mapped 3-body potential. Calculates the energy predicted by the GP for three atoms at all possible combination of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The grid building exploits all the permutation invariances to reduce the number of energy calculations needed to fill the grid. The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated to any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the triplet contributions of every valid triplet of atoms of which one is always the central one. The prediction is done by the calculator module which is built to work within the ase python package.

Parameters

start (float) – smallest interatomic distance for which the energy is predicted by the GP and stored in the 3-body mapped potential
num (int) – number of points to use to generate the list of distances used to generate the triplets of atoms for the mapped potential
ncores (int) – number of CPUs to use to calculate the energy predictions
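The saving obtained from the triangle inequality and from permutation invariance can be estimated with a small standard-library sketch. It is not the mff triplet generator: it simply counts, for an illustrative grid, how many of the num^3 combinations of distances form a triangle and how many remain once permutations of the same three distances are counted once. Degenerate (flat) triangles are kept here, which is an arbitrary choice of this sketch, and all names and values are hypothetical.

```python
from itertools import combinations_with_replacement, product

def linspace(start, stop, num):
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]

def is_triangle(a, b, c):
    """Triangle inequality: no side longer than the sum of the other two."""
    return a <= b + c and b <= a + c and c <= a + b

# Illustrative values: start=1.0, r_cut=4.0, num=7
dists = linspace(1.0, 4.0, 7)

# Every ordered triplet of distances that forms a triangle
all_valid = [t for t in product(dists, repeat=3) if is_triangle(*t)]

# One representative per set of permuted distances (single species case)
unique_valid = [t for t in combinations_with_replacement(dists, 3)
                if is_triangle(*t)]

print(len(dists) ** 3, len(all_valid), len(unique_valid))
```

Only the representatives need an energy prediction; the remaining entries of the cube can be filled by copying the value to the permuted positions, which is where the reduction in GP evaluations comes from.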

fit(confs, forces, ncores=1)¶

Fit the GP to a set of training forces using a 3-body single species force-force kernel function

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)¶

Fit the GP to a set of training energies using a 3-body single species energy-energy kernel function

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
energies (array) – Array containing the scalar local energies of the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)¶

Fit the GP to a set of training forces and energies using 3-body single species force-force, energy-force and energy-energy kernels

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
energies (array) – Array containing the scalar local energies of the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)¶

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters: path (str) – path to the .json model file
Returns: the model object
Return type: model (obj)

static generate_triplets(dists)¶

Generate a list of all valid triplets using permutational invariance. Calculates the energy predicted by the GP for three atoms at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The grid building exploits all the permutation invariances to reduce the number of energy calculations needed to fill the grid. The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the contributions of every valid triplet in which that atom is the central one. The prediction is done by the calculator module, which is built to work within the ase Python package.

Parameters

dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)

Returns

array of booleans indicating which triplets (three distance values) need to be evaluated to fill the 3D grid of energy values
r_ij_x (array) – array containing the x coordinate of the second atom j w.r.t. the central atom i
r_ki_x (array) – array containing the x coordinate of the third atom k w.r.t. the central atom i
r_ki_y (array) – array containing the y coordinate of the third atom k w.r.t. the central atom i

Return type

inds (array)
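
As a rough illustration of how permutational invariance reduces the work, the sketch below evaluates a toy energy once per unordered triplet of distances and copies the value to every permutation. It assumes an energy that is symmetric under any exchange of the three distances, as expected for a single species; fill_symmetric_grid and toy_energy are hypothetical names, and the dictionary stands in for the 3D cube of values.

```python
from itertools import combinations_with_replacement, permutations


def is_valid_triangle(a, b, c):
    """True when the three distances can be the sides of a triangle."""
    return a <= b + c and b <= a + c and c <= a + b


def fill_symmetric_grid(dists, energy):
    """Evaluate `energy` once per unordered triplet and copy it to all permutations."""
    grid = {}
    n_evaluations = 0
    for idx in combinations_with_replacement(range(len(dists)), 3):
        triplet = tuple(dists[i] for i in idx)
        if not is_valid_triangle(*triplet):
            continue  # this cell of the cube is never needed
        value = energy(*triplet)
        n_evaluations += 1
        for perm in set(permutations(idx)):
            grid[perm] = value
    return grid, n_evaluations


def toy_energy(a, b, c):
    # Hypothetical symmetric function standing in for a GP energy prediction.
    return a * b * c


dists = [1.5, 2.5, 3.5, 4.5]
grid, n_evaluations = fill_symmetric_grid(dists, toy_energy)
print(n_evaluations, "evaluations fill", len(grid), "grid cells")
```

The saving grows with the grid size, since the number of unordered triplets approaches one sixth of the number of ordered ones.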

load_gp(filename)¶: Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)¶

Predict the forces acting on the central atoms of confs using a GP

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

array of force vectors predicted by the GP
forces_errors (array) – errors associated with the force predictions

Return type

forces (array)

predict_energy(confs, return_std=False, ncores=1)¶

Predict the global energies of the central atoms of confs using a GP

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

array of energies predicted by the GP
energies_errors (array) – errors associated with the energy predictions

Return type

energies (array)

save(path)¶

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potential, which are saved as separate .gpy and .gpz files, respectively.

Parameters: path (str) – path to the file
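
The layout described above (a JSON file recording the parameters and the paths of companion files) can be sketched with the standard library alone. The helper names save_model and load_model, the dictionary keys, and the file suffixes are all hypothetical; the library's own .gpy and .gpz formats are not reproduced here.

```python
import json
import os
import tempfile


def save_model(path, params, gp_blob, grid_blob):
    """Write params to JSON, plus two companion files whose paths the JSON records."""
    stem = os.path.splitext(path)[0]
    gp_path, grid_path = stem + "_gp.bin", stem + "_grid.bin"  # hypothetical names
    with open(gp_path, "wb") as f:
        f.write(gp_blob)
    with open(grid_path, "wb") as f:
        f.write(grid_blob)
    meta = {"params": params, "gp_file": gp_path, "grid_file": grid_path}
    with open(path, "w") as f:
        json.dump(meta, f, indent=2)


def load_model(path):
    """Read the JSON file, then follow the recorded paths to the companion files."""
    with open(path) as f:
        meta = json.load(f)
    with open(meta["gp_file"], "rb") as f:
        gp_blob = f.read()
    with open(meta["grid_file"], "rb") as f:
        grid_blob = f.read()
    return meta["params"], gp_blob, grid_blob


with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "thismodel.json")
    save_model(model_path, {"r_cut": 4.5, "sigma": 0.5}, b"gp-bytes", b"grid-bytes")
    loaded = load_model(model_path)
```

One consequence of this kind of layout is that the JSON file alone is not enough to restore a model: the companion files must remain at the recorded paths.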

save_gp(filename)¶: Saves the GP object, now obsolete

Combined Model¶

Module that uses 2- and 3-body kernels to do Gaussian process regression, and to build 2- and 3-body mapped potentials. The model has to be first defined, then the Gaussian processes must be trained using training configurations and forces (and/or energies). Once a model has been trained, it can be used to predict forces (and/or energies) on unknown atomic configurations. A trained Gaussian process can then be mapped onto a tabulated 2-body potential and a tabulated 3-body potential via the build_grid function call. A mapped model can then be saved, loaded and used to run molecular dynamics simulations via the calculator module. These mapped potentials retain the accuracy of the GP used to build them, while speeding up the calculations by a factor of 10^4 in typical scenarios.

Example:

from mff import models
mymodel = models.CombinedSingleSpecies(atomic_number, cutoff_radius,
                       sigma_2b, sigma_3b, theta_2b, theta_3b, noise)
mymodel.fit(training_confs, training_forces, ncores)
forces = mymodel.predict(test_configurations, ncores=ncores)
mymodel.build_grid(grid_start, num_2b, num_3b, ncores)
mymodel.save("thismodel.json")
mymodel = models.CombinedSingleSpecies.from_json("thismodel.json")

class mff.models.combined.CombinedManySpeciesModel(elements, r_cut, sigma_2b, sigma_3b, theta_2b, theta_3b, noise, rep_sig=1, **kwargs)¶

2- and 3-body many species model class. Class managing the Gaussian processes and their mapped counterparts.

Parameters

elements (list) – List containing the atomic numbers in increasing order
r_cut (float) – The cutoff radius used to carve the atomic environments
sigma_2b (float) – Lengthscale parameter of the 2-body Gaussian process
sigma_3b (float) – Lengthscale parameter of the 3-body Gaussian process
theta_2b (float) – decay ratio of the cutoff function in the 2-body Gaussian Process
theta_3b (float) – decay ratio of the cutoff function in the 3-body Gaussian Process
noise (float) – noise value associated with the training output data

gp_2b¶

The 2-body single species Gaussian Process

Type: method

gp_3b¶

The 3-body single species Gaussian Process

Type: method

grid_2b¶

Contains the three 2-body two species tabulated potentials, accounting for interactions between two atoms of types 0-0, 0-1, and 1-1.

Type: list

grid_3b¶

Contains the four 3-body two species tabulated potentials, accounting for interactions between three atoms of types 0-0-0, 0-0-1, 0-1-1, and 1-1-1.

Type: list

grid_start¶

Minimum atomic distance for which the grids are defined (cannot be 0.0)

Type: float

grid_num_2b¶

number of points to use in the grid of the 2-body mapped potential

Type: int

grid_num_3b¶

number of points to use to generate the list of distances used to generate the triplets of atoms for the 3-body mapped potential

Type: int

build_grid(start, num_2b, num_3b, ncores=1)¶

Function used to create the three different 2-body energy grids for atoms of elements 0-0, 0-1, and 1-1, and the four different 3-body energy grids for atoms of elements 0-0-0, 0-0-1, 0-1-1, and 1-1-1. The function calls the build_grid_3b function for each of the 3-body grids to build.

Parameters

start (float) – smallest interatomic distance for which the energy is predicted by the GP and stored in the mapped potentials
num_2b (int) – number of points to use in the grid of the 2-body mapped potentials
num_3b (int) – number of points to use to generate the list of distances used to generate the triplets of atoms for the 3-body mapped potentials
ncores (int) – number of CPUs to use to calculate the energy predictions

build_grid_3b(dists, element_k, element_i, element_j, ncores=1)¶

Build a mapped 3-body potential. Calculates the energy predicted by the GP for three atoms of elements element_i, element_j, element_k, at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality), found by calling the generate_triplets_with_permutation_invariance function. The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated to any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the triplet contributions of every valid triplet of atoms of which one is always the central one. The prediction is done by the calculator module which is built to work within the ase python package.

Parameters

dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)
element_i (int) – atomic number of the central atom i in a triplet
element_j (int) – atomic number of the second atom j in a triplet
element_k (int) – atomic number of the third atom k in a triplet
ncores (int) – number of CPUs to use when computing the triplet local energies

Returns

a 3D spline object that can be used to predict the energy and the force associated with the central atom of a triplet.

Return type

spline3D (obj)

fit(confs, forces, ncores=1)¶

Fit the GP to a set of training forces using 2- and 3-body single species force-force kernel functions. The 2-body Gaussian process is fitted first, then the 3-body GP is fitted to the difference between the training forces and the 2-body force predictions on the training configurations.

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation
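
The two-stage scheme described above (fit a first model, then fit a second model to what the first one leaves unexplained) can be sketched with simple stand-in regressors. The snippet does not use Gaussian processes: a one-parameter least-squares fit replaces each GP, and the features x2 and x3 are hypothetical placeholders for 2-body and 3-body information.

```python
def fit_slope(xs, ys):
    """Least-squares slope through the origin: minimises sum (y - a*x)^2."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Toy data: the target is the sum of a "2-body-like" and a "3-body-like" term.
x2 = [1.0, 2.0, 3.0, 4.0]        # hypothetical 2-body feature
x3 = [1.0, -1.0, 1.0, -1.0]      # hypothetical 3-body feature
targets = [2.0 * p + 0.5 * q for p, q in zip(x2, x3)]

# Stage 1: fit the first model directly to the targets.
a = fit_slope(x2, targets)
residuals = [t - a * p for t, p in zip(targets, x2)]

# Stage 2: fit the second model to the residuals of the first.
b = fit_slope(x3, residuals)

# The combined prediction is the sum of the two stages.
predictions = [a * p + b * q for p, q in zip(x2, x3)]

error_stage_1 = sum(r * r for r in residuals)
error_combined = sum((t - y) ** 2 for t, y in zip(targets, predictions))
print(error_stage_1, error_combined)
```

Because the second stage only sees residuals, the fit is sequential rather than joint: the first model absorbs as much of the target as it can before the second one is trained.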

fit_energy(glob_confs, energies, ncores=1)¶

Fit the GP to a set of training energies using 2- and 3-body single species energy-energy kernel functions. The 2-body Gaussian process is fitted first, then the 3-body GP is fitted to the difference between the training energies and the 2-body energy predictions on the training configurations.

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation
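
A minimal sketch of the glob_confs layout follows, assuming that the total energy of a snapshot is the sum of the local energies of its atoms. The environments are placeholder strings and the local energies are invented numbers; in the library each environment would be one of the M x 5 arrays used elsewhere in this reference.

```python
# Each inner list groups the local environments that belong to one snapshot.
glob_confs = [
    ["env_0a", "env_0b", "env_0c"],   # snapshot 0, three atoms
    ["env_1a", "env_1b"],             # snapshot 1, two atoms
]

# Hypothetical per-atom local energies, only to show how totals are assembled.
local_energy = {
    "env_0a": -1.0, "env_0b": -1.5, "env_0c": -0.5,
    "env_1a": -2.0, "env_1b": -1.0,
}

# One total energy per snapshot, in the same order as glob_confs.
energies = [sum(local_energy[env] for env in snapshot) for snapshot in glob_confs]
print(energies)
```

The energies array therefore has one entry per snapshot, not one per atom, which is why the grouping in glob_confs matters.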

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)¶

Fit the GP to a set of training forces and energies using 2- and 3-body single species force-force, energy-energy, and energy-force kernel functions. The 2-body Gaussian process is fitted first, then the 3-body GP is fitted to the difference between the training energies (and forces) and the 2-body predictions of energies (and forces) on the training configurations.

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)¶

Load the model. Loads the model, the associated GPs and the mapped potentials, if available.

Parameters: path (str) – path to the .json model file
Returns: the model object
Return type: model (obj)

static generate_triplets_all(dists)¶

Generate a list of all valid triplets. Calculates the energy predicted by the GP for three atoms at all possible combination of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated to any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the triplet contributions of every valid triplet of atoms of which one is always the central one. The prediction is done by the calculator module which is built to work within the ase python package.

Parameters

dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)

Returns

array of booleans indicating which triplets (three distance values) need to be evaluated to fill the 3D grid of energy values
r_ij_x (array) – array containing the x coordinate of the second atom j w.r.t. the central atom i
r_ki_x (array) – array containing the x coordinate of the third atom k w.r.t. the central atom i
r_ki_y (array) – array containing the y coordinate of the third atom k w.r.t. the central atom i

Return type

inds (array)

load_gp(filename_2b, filename_3b)¶: Loads the GP objects, now obsolete

predict(confs, return_std=False, ncores=1)¶

Predict the forces acting on the central atoms of confs using the 2- and 3-body GPs. The total force is the sum of the two predictions.

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

array of force vectors predicted by the GPs
forces_errors (array) – errors associated with the force predictions

Return type

forces (array)

predict_energy(glob_confs, return_std=False, ncores=1)¶

Predict the local energies of the central atoms of confs using the 2- and 3-body GPs. The total energy is the sum of the two predictions.

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

Array containing the total energy of each snapshot
energies_errors (array) – errors associated with the energy predictions

Return type

energies (array)

save(path)¶

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potentials, which are saved as separate .gpy and .gpz files, respectively.

Parameters: path (str) – path to the file

save_gp(filename_2b, filename_3b)¶: Saves the GP objects, now obsolete

class mff.models.combined.CombinedSingleSpeciesModel(element, r_cut, sigma_2b, sigma_3b, theta_2b, theta_3b, noise, rep_sig=1, **kwargs)¶

2- and 3-body single species model class. Class managing the Gaussian processes and their mapped counterparts.

Parameters

element (int) – The atomic number of the element considered
r_cut (float) – The cutoff radius used to carve the atomic environments
sigma_2b (float) – Lengthscale parameter of the 2-body Gaussian process
sigma_3b (float) – Lengthscale parameter of the 3-body Gaussian process
theta_2b (float) – decay ratio of the cutoff function in the 2-body Gaussian Process
theta_3b (float) – decay ratio of the cutoff function in the 3-body Gaussian Process
noise (float) – noise value associated with the training output data

gp_2b¶

The 2-body single species Gaussian Process

Type: method

gp_3b¶

The 3-body single species Gaussian Process

Type: method

grid_2b¶

The 2-body single species tabulated potential

Type: method

grid_3b¶

The 3-body single species tabulated potential

Type: method

grid_start¶

Minimum atomic distance for which the grids are defined (cannot be 0.0)

Type: float

grid_num¶

number of points per side used to create the 2- and 3-body grid. The 3-body grid is 3-dimensional, therefore its total number of grid points will be grid_num^3

Type: int

build_grid(start, num_2b, num_3b, ncores=1)¶

Build the mapped 2- and 3-body potentials. Calculates the energy predicted by the GP for two and three atoms at all possible combination of num distances ranging from start to r_cut. The energy for the 3-body mapped grid is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The grid building exploits all the permutation invariances to reduce the number of energy calculations needed to fill the grid. The computed 2-body energies are stored in an array of values, and a 1D spline interpolation is created. The computed 3-body energies are stored in a 3D cube of values, and a 3D spline interpolation is created. The total force or local energy can then be calculated for any atom by summing the pairwise and triplet contributions of every valid couple and triplet of atoms of which one is always the central one. The prediction is done by the calculator module, which is built to work within the ase python package.

Parameters

start (float) – smallest interatomic distance for which the energy is predicted by the GP and stored in the mapped potentials
num_2b (int) – number of points to use in the grid of the 2-body mapped potential
num_3b (int) – number of points to use to generate the list of distances used to generate the triplets of atoms for the 3-body mapped potential
ncores (int) – number of CPUs to use to calculate the energy predictions
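
The summation over pairs and triplets that share a central atom can be sketched as follows. This is an assumption-laden illustration rather than the calculator module's code: toy_pair and toy_triplet are hypothetical stand-ins for the 1D and 3D spline interpolations, only central-neighbour distances are compared with the cutoff, and any normalisation factors the library may apply are omitted.

```python
from itertools import combinations
from math import dist


def local_energy(central, neighbours, r_cut, pair_energy, triplet_energy):
    """Sum pair and triplet terms over the neighbours of `central` within r_cut."""
    close = [p for p in neighbours if dist(central, p) < r_cut]
    total = sum(pair_energy(dist(central, p)) for p in close)
    for p, q in combinations(close, 2):
        # Each triplet is described by its three interatomic distances.
        total += triplet_energy(dist(central, p), dist(central, q), dist(p, q))
    return total


def toy_pair(r):
    # Hypothetical stand-in for the tabulated 2-body potential.
    return 1.0 / r


def toy_triplet(r_ij, r_ik, r_jk):
    # Hypothetical stand-in for the tabulated 3-body potential.
    return 0.1 * (r_ij + r_ik + r_jk)


central = (0.0, 0.0, 0.0)
neighbours = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 10.0)]
energy = local_energy(central, neighbours, 4.5, toy_pair, toy_triplet)
print(energy)
```

The third neighbour lies beyond the cutoff, so it contributes neither a pair term nor any triplet term.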

fit(confs, forces, ncores=1)¶

Fit the GP to a set of training forces using 2- and 3-body single species force-force kernel functions. The 2-body Gaussian process is fitted first, then the 3-body GP is fitted to the difference between the training forces and the 2-body force predictions on the training configurations.

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)¶

Fit the GP to a set of training energies using 2- and 3-body single species energy-energy kernel functions. The 2-body Gaussian process is fitted first, then the 3-body GP is fitted to the difference between the training energies and the 2-body energy predictions on the training configurations.

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)¶

Fit the GP to a set of training forces and energies using 2- and 3-body single species force-force, energy-energy, and energy-force kernel functions. The 2-body Gaussian process is fitted first, then the 3-body GP is fitted to the difference between the training energies (and forces) and the 2-body predictions of energies (and forces) on the training configurations.

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)¶

Load the model. Loads the model, the associated GPs and the mapped potentials, if available.

Parameters: path (str) – path to the .json model file
Returns: the model object
Return type: model (obj)

static generate_triplets(dists)¶

Generate a list of all valid triplets using permutational invariance. Calculates the energy predicted by the GP for three atoms at all possible combinations of num distances ranging from start to r_cut. The energy is calculated only for valid triplets of atoms, i.e. sets of three distances which form a triangle (this is checked via the triangle inequality). The grid building exploits all the permutation invariances to reduce the number of energy calculations needed to fill the grid. The computed energies are stored in a 3D cube of values, and a 3D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any triplet of atoms. The total force or local energy can then be calculated for any atom by summing the contributions of every valid triplet in which that atom is the central one. The prediction is done by the calculator module, which is built to work within the ase Python package.

Parameters

dists (array) – array of floats containing all of the distances which can be used to build triplets of atoms. This array is created by calling np.linspace(start, r_cut, num)

Returns

array of booleans indicating which triplets (three distance values) need to be evaluated to fill the 3D grid of energy values
r_ij_x (array) – array containing the x coordinate of the second atom j w.r.t. the central atom i
r_ki_x (array) – array containing the x coordinate of the third atom k w.r.t. the central atom i
r_ki_y (array) – array containing the y coordinate of the third atom k w.r.t. the central atom i

Return type

inds (array)

load_gp(filename_2b, filename_3b)¶: Loads the GP objects, now obsolete

predict(confs, return_std=False, ncores=1)¶

Predict the forces acting on the central atoms of confs using the 2- and 3-body GPs. The total force is the sum of the two predictions.

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

array of force vectors predicted by the GPs
forces_errors (array) – errors associated with the force predictions

Return type

forces (array)

predict_energy(glob_confs, return_std=False, ncores=1)¶

Predict the local energies of the central atoms of confs using the 2- and 3-body GPs. The total energy is the sum of the two predictions.

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

Array containing the total energy of each snapshot
energies_errors (array) – errors associated with the energy predictions

Return type

energies (array)

save(path)¶

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potentials, which are saved as separate .gpy and .gpz files, respectively.

Parameters: path (str) – path to the file

save_gp(filename_2b, filename_3b)¶: Saves the GP objects, now obsolete

class mff.models.combined.NpEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)¶

default(obj)¶

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
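
The same pattern can be applied to array-like objects. The source does not show how NpEncoder is implemented, so the sketch below is only an assumption about the general idea: a hypothetical ToListEncoder converts any object exposing a tolist() method into a plain list, demonstrated here with the standard library's array.array so that no third-party package is needed.

```python
import json
from array import array


class ToListEncoder(json.JSONEncoder):
    """Encode objects exposing tolist() (e.g. array.array) as plain lists."""

    def default(self, o):
        if hasattr(o, "tolist"):
            return o.tolist()
        # Let the base class raise TypeError for anything else.
        return super().default(o)


payload = {"grid_start": 1.5, "dists": array("d", [1.5, 2.5, 3.5])}
text = json.dumps(payload, cls=ToListEncoder)
print(text)
```

Passing the encoder through the cls argument keeps the conversion logic in one place instead of converting every value by hand before calling json.dumps.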

Eam Model¶

Module that uses an eam-like many-body kernel to do Gaussian process regression, and to build eam mapped potentials. The model has to be first defined, then the Gaussian processes must be trained using training configurations and forces (and/or energies). Once a model has been trained, it can be used to predict forces (and/or energies) on unknown atomic configurations. A trained Gaussian process can then be mapped onto a tabulated eam potential via the build_grid function call. A mapped model can then be saved, loaded and used to run molecular dynamics simulations via the calculator module. These mapped potentials retain the accuracy of the GP used to build them, while speeding up the calculations by a factor of 10^4 in typical scenarios.

Example:

from mff import models
mymodel = models.EamSingleSpecies(atomic_number, cutoff_radius,
                        sigma, alpha, r0, noise)
mymodel.fit(training_confs, training_forces, ncores)
forces = mymodel.predict(test_configurations, ncores=ncores)
mymodel.build_grid(num, ncores)
mymodel.save("thismodel.json")
mymodel = models.EamSingleSpecies.from_json("thismodel.json")

TwoThreeEam Model

class mff.models.eam.EamManySpeciesModel(elements, r_cut, sigma, r0, noise, **kwargs)¶

Eam many species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters

elements (list) – The atomic numbers of the elements considered
r_cut (float) – The cutoff radius used to carve the atomic environments
sigma (float) – Lengthscale parameter of the Gaussian process
r0 (float) – radius in the exponent of the eam descriptor
noise (float) – noise value associated with the training output data

gp¶

The eam single species Gaussian Process

Type: method

grid¶

The eam single species tabulated potential

Type: method

grid_start¶

Minimum descriptor value for which the grid is defined

Type: float

grid_end¶

Maximum descriptor value for which the grid is defined

Type: float

grid_num¶

number of points used to create the eam multi grid

Type: int

build_grid(num, ncores=1)¶

Build the mapped eam potential. Calculates the energy predicted by the GP for a configuration whose eam descriptor is evaluated between start and end. These energies are stored and a 1D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any embedded atom. The prediction is done by the calculator module, which is built to work within the ase Python package.

Parameters

num (int) – number of points to use in the grid of the mapped potential
ncores (int) – number of CPUs to use for the gram matrix evaluation
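
The idea of tabulating an expensive energy function once and then interpolating it can be sketched as below. To stay dependency-free the snippet uses piecewise-linear interpolation rather than the spline mentioned above, so it is a simplified stand-in; Table1D and expensive_energy are hypothetical names, and the quadratic function only plays the role of a GP energy prediction.

```python
from bisect import bisect_right


class Table1D:
    """Piecewise-linear table of a function sampled on a uniform grid."""

    def __init__(self, func, start, end, num):
        step = (end - start) / (num - 1)
        self.xs = [start + i * step for i in range(num)]
        self.ys = [func(x) for x in self.xs]   # the only calls to func

    def _segment(self, x):
        # Index of the left node of the segment containing x, clamped to the table.
        i = bisect_right(self.xs, x) - 1
        return min(max(i, 0), len(self.xs) - 2)

    def value(self, x):
        i = self._segment(x)
        t = (x - self.xs[i]) / (self.xs[i + 1] - self.xs[i])
        return (1 - t) * self.ys[i] + t * self.ys[i + 1]

    def derivative(self, x):
        # Exact derivative of the linear interpolant on the segment containing x.
        i = self._segment(x)
        return (self.ys[i + 1] - self.ys[i]) / (self.xs[i + 1] - self.xs[i])


def expensive_energy(x):
    # Hypothetical stand-in for a GP energy prediction.
    return x * x


table = Table1D(expensive_energy, 0.0, 2.0, 201)
print(table.value(1.005), table.derivative(1.005))
```

After construction, every lookup costs a binary search and a few arithmetic operations regardless of how expensive the tabulated function was, which is the source of the speed-up that mapped potentials aim for.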

fit(confs, forces, ncores=1)¶

Fit the GP to a set of training forces using an eam single species force-force kernel

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)¶

Fit the GP to a set of training energies using an eam single species energy-energy kernel

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)¶

Fit the GP to a set of training forces and energies using eam single species force-force, energy-force and energy-energy kernels

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)¶

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters: path (str) – path to the .json model file
Returns: the model object
Return type: model (obj)

load_gp(filename)¶: Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)¶

Predict the forces acting on the central atoms of confs using a GP

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

array of force vectors predicted by the GP
forces_errors (array) – errors associated with the force predictions

Return type

forces (array)

predict_energy(glob_confs, return_std=False, ncores=1)¶

Predict the global energies of the central atoms of confs using a GP

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
return_std (bool) – if True, returns the standard deviation associated to predictions according to the GP framework

Returns

Array containing the total energy of each snapshot
energies_errors (array) – errors associated with the energy predictions

Return type

energies (array)

save(path)¶

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potential, which are saved as separate .gpy and .gpz files, respectively.

Parameters: path (str) – path to the file

save_gp(filename)¶: Saves the GP object, now obsolete

class mff.models.eam.EamSingleSpeciesModel(element, r_cut, sigma, r0, noise, **kwargs)¶

Eam single species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters

element (int) – The atomic number of the element considered
r_cut (float) – The cutoff radius used to carve the atomic environments
sigma (float) – Lengthscale parameter of the Gaussian process
r0 (float) – radius in the exponent of the eam descriptor
noise (float) – noise value associated with the training output data

gp¶

The eam single species Gaussian Process

Type: method

grid¶

The eam single species tabulated potential

Type: method

grid_start¶

Minimum descriptor value for which the grid is defined

Type: float

grid_end¶

Maximum descriptor value for which the grid is defined

Type: float

grid_num¶

number of points used to create the eam multi grid

Type: int

build_grid(num, ncores=1)¶

Build the mapped eam potential. Calculates the energy predicted by the GP for a configuration whose eam descriptor is evaluated between start and end. These energies are stored and a 1D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any embedded atom. The prediction is done by the calculator module, which is built to work within the ase Python package.

Parameters

num (int) – number of points to use in the grid of the mapped potential
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit(confs, forces, ncores=1)¶

Fit the GP to a set of training forces using an eam single species force-force kernel

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)¶

Fit the GP to a set of training energies using an eam single species energy-energy kernel

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation
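
The nesting expected for glob_confs can be illustrated with a small sketch. It assumes each local configuration comes with a snapshot index; group_by_snapshot and snapshot_ids are hypothetical names used only here, the strings are placeholders for the real configuration arrays, and the energies are made-up values.

```python
from collections import defaultdict

def group_by_snapshot(local_confs, snapshot_ids):
    """Group local configurations so that each inner list holds one snapshot."""
    groups = defaultdict(list)
    for conf, snap in zip(local_confs, snapshot_ids):
        groups[snap].append(conf)
    # Sort by snapshot index so the order matches the energies array
    return [groups[snap] for snap in sorted(groups)]

# Placeholders: in practice each entry would be an M x 5 array
local_confs = ["conf_a0", "conf_a1", "conf_b0", "conf_b1", "conf_b2"]
snapshot_ids = [0, 0, 1, 1, 1]
energies = [-3.2, -5.1]  # one total energy per snapshot (made-up values)

glob_confs = group_by_snapshot(local_confs, snapshot_ids)
# There must be exactly one energy per group of configurations
assert len(glob_confs) == len(energies)
```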

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)¶

Fit the GP to a set of training forces and energies using eam single species force-force, energy-force and energy-energy kernels

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)¶

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters: path (str) – path to the .json model file
Returns: the model object
Return type: model (obj)

load_gp(filename)¶: Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)¶

Predict the forces acting on the central atoms of confs using a GP

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
return_std (bool) – if True, also returns the standard deviation associated with the predictions according to the GP framework
ncores (int) – number of CPUs to use

Returns

forces (array) – array of force vectors predicted by the GP
forces_errors (array) – errors associated with the force predictions, returned only if return_std is True

predict_energy(glob_confs, return_std=False, ncores=1)¶

Predict the total energy of each snapshot in glob_confs using a GP

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
return_std (bool) – if True, also returns the standard deviation associated with the predictions according to the GP framework
ncores (int) – number of CPUs to use

Returns

energies (array) – array containing the total energy of each snapshot
energies_errors (array) – errors associated with the energy predictions, returned only if return_std is True

save(path)¶

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potential, which are saved as separate .gpy and .gpz files, respectively.

Parameters: path (str) – path to the file
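
The layout described for save, a .json file that stores the parameters together with the paths of separately saved objects, can be sketched with the standard library alone. This is a sketch of the pattern and not the mff code: save_model, load_model, the _gp.bin and _grid.bin suffixes, and the byte strings standing in for the GP and the mapped potential are all hypothetical.

```python
import json
import os
import tempfile

def save_model(path, params, gp_blob, grid_blob):
    """Write params to a .json file plus two sidecar files next to it."""
    base, _ = os.path.splitext(path)
    gp_path, grid_path = base + "_gp.bin", base + "_grid.bin"
    with open(gp_path, "wb") as f:
        f.write(gp_blob)
    with open(grid_path, "wb") as f:
        f.write(grid_blob)
    # The .json file records the parameters and where the sidecars live
    record = dict(params, gp_file=gp_path, grid_file=grid_path)
    with open(path, "w") as f:
        json.dump(record, f, indent=2)

def load_model(path):
    """Read the .json file, then follow the stored paths to the sidecars."""
    with open(path) as f:
        record = json.load(f)
    with open(record["gp_file"], "rb") as f:
        gp_blob = f.read()
    with open(record["grid_file"], "rb") as f:
        grid_blob = f.read()
    return record, gp_blob, grid_blob

with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "thismodel.json")
    save_model(model_path, {"r_cut": 5.0, "sigma": 1.0}, b"gp-bytes", b"grid-bytes")
    record, gp_blob, grid_blob = load_model(model_path)
```

Because the .json file stores paths rather than the objects themselves, the sidecar files have to be moved together with it for a later load to succeed.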

save_gp(filename)¶: Saves the GP object, now obsolete

class mff.models.eam.NpEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)¶

default(obj)¶

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
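
As a concrete illustration of overriding default, the sketch below converts any object exposing a tolist() method into a plain list before serialization. ToListEncoder is a hypothetical class written for this example and is not the NpEncoder source; array.array from the standard library stands in for the array types a model would hold.

```python
import json
from array import array

class ToListEncoder(json.JSONEncoder):
    """Hypothetical encoder: falls back to obj.tolist() for array-like objects."""

    def default(self, obj):
        tolist = getattr(obj, "tolist", None)
        if callable(tolist):
            return tolist()
        # Let the base class raise TypeError for anything else
        return super().default(obj)

payload = {"grid_num": 3, "values": array("d", [0.5, 1.0, 1.5])}
encoded = json.dumps(payload, cls=ToListEncoder)
decoded = json.loads(encoded)
```

The encoder is passed through the cls argument of json.dumps, so the calling code does not need to convert the arrays by hand.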

Module that uses a 2+3+many-body kernel to do Gaussian process regression, and to build eam mapped potentials. The model has to be defined first, then the Gaussian processes must be trained using training configurations and forces (and/or energies). Once a model has been trained, it can be used to predict forces (and/or energies) on unknown atomic configurations. A trained Gaussian process can then be mapped onto tabulated 2-body, 3-body and eam potentials via the build_grid function call. A mapped model can then be saved, loaded and used to run molecular dynamics simulations via the calculator module. These mapped potentials retain the accuracy of the GP used to build them, while speeding up the calculations by a factor of 10^4 in typical scenarios.

Example:

from mff import models

# Build a 2+3+eam model for a single species
mymodel = models.TwoThreeEamSingleSpecies(atomic_number, cutoff_radius, sigma_2b, sigma_3b,
                               sigma_eam, theta_2b, theta_3b, alpha, r0, noise, rep_sig)
# Train on forces, then predict on unseen configurations
mymodel.fit(training_confs, training_forces, ncores)
forces = mymodel.predict(test_configurations, ncores=ncores)
# Map the trained GP onto tabulated potentials, then save and reload the model
mymodel.build_grid(start, start_eam, end_eam, num_2b, num_3b, num_eam, ncores)
mymodel.save("thismodel.json")
mymodel = models.TwoThreeEamSingleSpecies.from_json("thismodel.json")

class mff.models.eam.EamManySpeciesModel(elements, r_cut, sigma, r0, noise, **kwargs)

Eam many species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters

elements (list of int) – The atomic numbers of the elements considered
r_cut (float) – The cutoff radius used to carve the atomic environments
sigma (float) – Lengthscale parameter of the Gaussian process
r0 (float) – radius in the exponent of the eam descriptor
noise (float) – noise value associated with the training output data

gp

The eam many species Gaussian Process

Type: method

grid

The eam many species tabulated potential

Type: method

grid_start

Minimum descriptor value for which the grid is defined

Type: float

grid_end

Maximum descriptor value for which the grid is defined

Type: float

grid_num

number of points used to create the eam multi grid

Type: int

build_grid(num, ncores=1)

Build the mapped eam potential. Calculates the energy predicted by the GP for a configuration whose eam descriptor is evaluated between start and end. These energies are stored and a 1D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any embedded atom. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters

num (int) – number of points to use in the grid of the mapped potential
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit(confs, forces, ncores=1)

Fit the GP to a set of training forces using an eam many species force-force kernel

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)

Fit the GP to a set of training energies using an eam many species energy-energy kernel

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)

Fit the GP to a set of training forces and energies using eam many species force-force, energy-force and energy-energy kernels

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters: path (str) – path to the .json model file
Returns: the model object
Return type: model (obj)

load_gp(filename): Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)

Predict the forces acting on the central atoms of confs using a GP

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
return_std (bool) – if True, also returns the standard deviation associated with the predictions according to the GP framework
ncores (int) – number of CPUs to use

Returns

forces (array) – array of force vectors predicted by the GP
forces_errors (array) – errors associated with the force predictions, returned only if return_std is True

predict_energy(glob_confs, return_std=False, ncores=1)

Predict the total energy of each snapshot in glob_confs using a GP

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
return_std (bool) – if True, also returns the standard deviation associated with the predictions according to the GP framework
ncores (int) – number of CPUs to use

Returns

energies (array) – array containing the total energy of each snapshot
energies_errors (array) – errors associated with the energy predictions, returned only if return_std is True

save(path)

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potential, which are saved as separate .gpy and .gpz files, respectively.

Parameters: path (str) – path to the file

save_gp(filename): Saves the GP object, now obsolete

class mff.models.eam.EamSingleSpeciesModel(element, r_cut, sigma, r0, noise, **kwargs)

Eam single species model class. Class managing the Gaussian process and its mapped counterpart.

Parameters

element (int) – The atomic number of the element considered
r_cut (float) – The cutoff radius used to carve the atomic environments
sigma (float) – Lengthscale parameter of the Gaussian process
r0 (float) – radius in the exponent of the eam descriptor
noise (float) – noise value associated with the training output data

gp

The eam single species Gaussian Process

Type: method

grid

The eam single species tabulated potential

Type: method

grid_start

Minimum descriptor value for which the grid is defined

Type: float

grid_end

Maximum descriptor value for which the grid is defined

Type: float

grid_num

number of points used to create the eam multi grid

Type: int

build_grid(num, ncores=1)

Build the mapped eam potential. Calculates the energy predicted by the GP for a configuration whose eam descriptor is evaluated between start and end. These energies are stored and a 1D spline interpolation is created, which can be used to predict the energy and, through its analytic derivative, the force associated with any embedded atom. The prediction is done by the calculator module, which is built to work within the ASE Python package.

Parameters

num (int) – number of points to use in the grid of the mapped potential
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit(confs, forces, ncores=1)

Fit the GP to a set of training forces using an eam single species force-force kernel

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_energy(glob_confs, energies, ncores=1)

Fit the GP to a set of training energies using an eam single species energy-energy kernel

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

fit_force_and_energy(confs, forces, glob_confs, energies, ncores=1)

Fit the GP to a set of training forces and energies using eam single species force-force, energy-force and energy-energy kernels

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
forces (array) – Array containing the vector forces on the central atoms of the training configurations
glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
energies (array) – Array containing the total energy of each snapshot
ncores (int) – number of CPUs to use for the gram matrix evaluation

classmethod from_json(path)

Load the model. Loads the model, the associated GP and the mapped potential, if available.

Parameters: path (str) – path to the .json model file
Returns: the model object
Return type: model (obj)

load_gp(filename): Loads the GP object, now obsolete

predict(confs, return_std=False, ncores=1)

Predict the forces acting on the central atoms of confs using a GP

Parameters

confs (list) – List of M x 5 arrays containing coordinates and atomic numbers of atoms within a cutoff from the central one
return_std (bool) – if True, also returns the standard deviation associated with the predictions according to the GP framework
ncores (int) – number of CPUs to use

Returns

forces (array) – array of force vectors predicted by the GP
forces_errors (array) – errors associated with the force predictions, returned only if return_std is True

predict_energy(glob_confs, return_std=False, ncores=1)

Predict the total energy of each snapshot in glob_confs using a GP

Parameters

glob_confs (list of lists) – List of configurations arranged so that grouped configurations belong to the same snapshot
return_std (bool) – if True, also returns the standard deviation associated with the predictions according to the GP framework
ncores (int) – number of CPUs to use

Returns

energies (array) – array containing the total energy of each snapshot
energies_errors (array) – errors associated with the energy predictions, returned only if return_std is True

save(path)

Save the model. This creates a .json file containing the parameters of the model and the paths to the GP objects and the mapped potential, which are saved as separate .gpy and .gpz files, respectively.

Parameters: path (str) – path to the file

save_gp(filename): Saves the GP object, now obsolete
