The Open Force Field Toolkit supports the SMIRNOFF virtual site specification
for models using off-site charges, including 4- and 5-point water models, in
addition to lone pair modelling on various functional groups. The primary focus
is on the ability to parameterize a system using virtual sites, and generate an
OpenMM system with all virtual sites present and ready for evaluation. Support
for formats other than OpenMM has not yet been implemented, but may come with
the appearance of the OpenFF system object. In addition to implementing the
specification, the toolkit
Molecule object allows the creation and manipulation of virtual sites.
Virtual sites can be added to a System in two ways:
SMIRNOFF Force Fields can contain a VirtualSites tag, specifying the addition of virtual sites according to SMARTS-based rules.
Virtual sites directly depend on 3D conformation, because the position of the virtual sites depends on vectors defined by the atoms specified during parameterization. Because of this, a virtual site matching the triplet of atoms 1-2-3 will define a point that is different from a triplet matching 3-2-1. This is similar to defining “right-handed” and “left-handed” coordinate systems. This subtlety plays into two major concepts in force field development:
We sometimes want to define a single virtual site describing two points with the same parameters (distance, angle, etc.) via symmetry, such as 5-point water models.
We have a match that produces multiple orderings of the atoms (e.g. if wildcards are present in the SMARTS pattern), and we only want one to be applied.
Case (1) is very useful for parameter optimization, where a single SMARTS-based
parameter can be used to optimize both points, such as the angle defining the
virtual points for a 5-point water model. Case (2) is the typical scenario for
the nitrogen lone pair in ammonia, where only one point needs to be specified.
We discuss a few more illustrative examples below. Beyond these attributes, the
virtual site specification allows a policy for specifying how to handle
exclusions in the OpenMM force evaluator. The current default is to add
pairwise energy exclusions in the OpenMM system between a virtual site and all
tagged atoms matched in its SMARTS(
exclusion_policy="parents", ). Currently
the single atom that the virtual site defines as the “origin”. For water, for
"minimal" would mean just the oxygen, whereas
"parents" would mean
all three atoms.
In order to give consistent and intended behavior, the specification was
modified from its draft form in following manner: The
attributes have been added to each virtual site parameter type. These changes
specifying different virtual site types using the same atoms
allowing two virtual sites with the same type and same atoms but different physical parameters to be added simultaneously
allowing the ability to control whether the virtual site encodes one or multiple particles, based on the number of ways the matching atoms can be ordered.
"name" attribute encodes whether the virtual site to be added should
override an existing virtual site of the same type (e.g. hierarchy preference),
or if this virtual site should be added in addition to the other existing
virtual sites on the given atoms. This means that different virtual site types
can share the same group of parent atoms and use the same name without
overwriting each other (the default
EP for all sites, which gives
the expected hierarchical behavior used in other SMIRNOFF tags).
"match" attribute accepts either
offering control for situations where a SMARTS pattern can possibly match the
same group of atoms in different orders(either due to wildcards or local
symmetry) and it is desired to either add just one or all of the possible
virtual particles. The default value is
"all_permutations", but for
TrivalentLonePair it is always set to
"once", regardless of what the file
contains, since all orderings always place the particle in the exact same
The following cases exemplify our reasoning in implementing this behavior, and should draw caution to complex issues that may arise when designing virtual site parameters. Let us consider 4-, 5-, and 6-point water models:
A 4-point water model with a
DivalentLonePair: This can be implemented by specifying
distance=-.15*angstrom". Since the SMIRKS pattern
"[#1:1]-[#8X2:2]- [#2:3]"would match water twice and would create two particles in the exact same position if
all_permutationswas specified, we specify
"once"to have only one particle generated. Although having two particles in the same position should not affect the physics if the proper exclusion policy is applied, it would effectively make the 4-point model just as expensive as 5-point models.
A 5-point water model with a
DivalentLonePair: This can be implemented by using
match="all_permutations"(unlike the 4-point model),
distance=0.7*angstrom, for example. Here the permutations will cause particles to be placed at ±56.26 degrees, and changing any of the physical quantities will affect both particles.
A 6-point water model with both
DivalentLonePairsites above. Since these two parameters look identical, it is unclear whether they should both be applied or if one should override the other. The toolkit never compares the physical numbers to determine equality as this can lead to instability during e.g. parameter fitting. To get this to work, we specify
name="EP1"for the first parameter, and
name="EP2"for the second parameter. This instructs the parameter handler keep them separate, and therefore both are applied. (If both had the same name, then the typical SMIRNOFF hierarchy rules are used, and only the last matched parameter would be applied.)
BondChargevirtual site. Since we want a
BondChargeon both ends, we specify
MonovalentLonePairvirtual site(s) on the oxygen, with the aim of modeling both lone pairs. This one is subtle, since
[#1:3]-[#6X3:2]=[#8X1:1]matches two unique groups of atoms (
2-3-4). It is important to note in this situation that
match="all_permutations"behaves exactly the same as
match="once". Due to the anchoring hydrogens (
2) being symmetric but opposite about the bond between
4, a single parameter does correctly place both lone pairs. A standing issue here is that the default exclusion policy (
parents) will allow these two virtual sites to interact since they have different indexed atoms (parents), causing the energy to be different than the non-virtual site parameterization. In the future, the
exclusion_policy="local"will account for this, and make virtual sites that share at least one “parent” atom not interact with each other. As a special note: when applying a
MonovalentLonePairto a completely symmetric molecule, e.g. water,
all_permutationscan come into play, but this will apply two particles (one for each hydrogen).
Finally, the toolkit handles the organization of atoms and virtual sites in a
specific manner. Virtual sites are expected to be added after all molecules in
the topology are present. This is because the Open Force Field Toolkit
organizes a topology by placing all atoms first, then all virtual sites last.
This differs from the OpenMM Modeller object, for example, which interleaves
the order of atoms and virtual sites in such a way that all particles of a
molecule are contiguous. In addition, due to the fact that a virtual site may
contain multiple particles coupled to single parameters, the toolkit makes a
distinction between a virtual site, and a virtual particle. A virtual site
may represent multiple virtual particles, so the total number of particles
cannot be directly determined by simply summing the number of atoms and virtual
sites in a molecule. This is taken into account, however, and the
Topology classes both implement