Releases follow the major.minor.micro scheme recommended by PEP 440, where
major increments denote a change that may break API compatibility with previous major releases
minor increments add features but do not break API compatibility
micro increments represent bugfix releases or improvements in documentation
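As an illustration of the scheme above, the following minimal sketch reports which component changed between two release strings. The helper names `parse_version` and `classify_increment` are hypothetical and are not part of the toolkit, and the sketch only handles plain three-part versions rather than the full PEP 440 grammar.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a plain 'major.minor.micro' string into three integers."""
    major, minor, micro = (int(part) for part in version.split("."))
    return major, minor, micro


def classify_increment(old: str, new: str) -> str:
    """Return the most significant component that differs between two versions."""
    labels = ("major", "minor", "micro")
    for label, before, after in zip(labels, parse_version(old), parse_version(new)):
        if before != after:
            return label
    return "none"


print(classify_increment("0.7.2", "0.8.0"))  # minor
print(classify_increment("0.7.1", "0.7.2"))  # micro
```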
PR #751: Removes the optional oetools=("oechem", "oequacpac", "oeiupac", "oeomega") keyword argument from OpenEyeToolkitWrapper.is_available, as there are no special behaviors that are accessed in the case of partially-licensed OpenEye backends. The new behavior of this method is the same as if the default value above is always provided.
PR #583: Methods such as Molecule.from_openeye, which delegate their internal logic to ToolkitRegistry functions, now guarantee that they will return an object of the correct type when being called on Molecule-derived classes. Previously, running these constructors using subclasses of FrozenMolecule would not return an instance of that subclass, but rather just an instance of a
0.8.0 - Virtual Sites
This release implements the SMIRNOFF virtual site specification. The implementation enables support for models using off-site charges, including 4- and 5-point water models, in addition to lone pair modeling on various functional groups. The primary focus was on the ability to parameterize a system using virtual sites, and to generate an OpenMM system with all virtual sites present and ready for evaluation. Support for formats other than OpenMM has not been implemented in this release, but may come with the appearance of the OpenFF system object. In addition to implementing the specification, the toolkit Molecule objects now allow the creation and manipulation of virtual sites.
Major Feature: Support for the SMIRNOFF VirtualSite tag
Virtual sites can be added to a System in two ways:
SMIRNOFF Force Fields can contain a VirtualSites tag, specifying the addition of virtual sites according to SMARTS-based rules.
Virtual sites are the first parameters which directly depend on 3D conformation, where the positions of the virtual sites are based on vectors defined on the atoms that were matched during parameterization. Because of this, a virtual site matching the triplet of atoms 1-2-3 will define a point that is different from a triplet matching 3-2-1. This is similar to defining “right-handed” and “left-handed” coordinate systems. This subtlety interplays with two major concepts in force field development:
we sometimes want to define a single virtual site describing two points with the same parameters (distance, angle, etc.), such as 5-point water models
we have a match that produces multiple orderings of the atoms (e.g. if wildcards are present in the SMARTS pattern), and we only want one to be applied.
Case 1) is very useful for parameter optimization, where a single SMARTS-based parameter can be used to optimize both points, such as the angle defining the virtual points for a 5-point water model. Case 2) is the typical scenario for the nitrogen lone pair in ammonia, where only one point needs to be specified. We discuss a few more illustrative examples below.
Beyond these attributes, the virtual site specification allows a policy for specifying how to handle exclusions in the OpenMM force evaluator. The current default is to add pairwise energy exclusions in the OpenMM system between a virtual site and all tagged atoms matched in its SMARTS (exclusion_policy="parents"). Among the currently defined policies, "minimal" specifies the single atom that the virtual site defines as the “origin”. For water, for example, "minimal" would mean just the oxygen, whereas "parents" would mean all three atoms.
In order to give consistent and intended behavior, the specification was modified from its draft form in the following manner: The "name" and "match" attributes have been added to each virtual site parameter type. These changes allow for
specifying different virtual site types using the same atoms
allowing two virtual sites with the same type and same atoms but different physical parameters to be added simultaneously
allowing the ability to control whether the virtual site encodes one or multiple particles, based on the number of ways the matching atoms can be ordered.
The "name" attribute encodes whether the virtual site to be added should override an existing virtual site of the same type (e.g. hierarchy preference), or if this virtual site should be added in addition to the other existing virtual sites on the given atoms. This means that different virtual site types can share the same group of parent atoms and use the same name without overwriting each other (the default name is EP for all sites, which gives the expected hierarchical behavior used in other SMIRNOFF tags).
The "match" attribute accepts either "once" or "all_permutations", offering control for situations where a SMARTS pattern can possibly match the same group of atoms in different orders (either due to wildcards or local symmetry) and it is desired to either add just one or all of the possible virtual particles. The default value is "all_permutations", but for TrivalentLonePair it is always set to "once", regardless of what the file contains, since all orderings always place the particle in the exact same position.
The following cases exemplify our reasoning in implementing this behavior, and should draw caution to complex issues that may arise when designing virtual site parameters. Let us consider 4-, 5-, and 6-point water models:
A 4-point water model with a DivalentLonePair: This can be implemented by specifying distance=-.15*angstrom. Since the SMIRKS pattern "[#1:1]-[#8X2:2]-[#1:3]" would match water twice and would create two particles in the exact same position if all_permutations was specified, we specify "once" to have only one particle generated. Although having two particles in the same position should not affect the physics if the proper exclusion policy is applied, it would effectively make the 4-point model just as expensive as 5-point models.
A 5-point water model with a DivalentLonePair: This can be implemented by using match="all_permutations" (unlike the 4-point model), distance=0.7*angstrom, for example. Here the permutations will cause particles to be placed at ±56.26 degrees, and changing any of the physical quantities will affect both particles.
A 6-point water model with both DivalentLonePair sites above. Since these two parameters look identical, it is unclear whether they should both be applied or if one should override the other. The toolkit never compares the physical numbers to determine equality as this can lead to instability during e.g. parameter fitting. To get this to work, we specify name="EP1" for the first parameter, and name="EP2" for the second parameter. This instructs the parameter handler to keep them separate, and therefore both are applied. (If both had the same name, then the typical SMIRNOFF hierarchy rules are used, and only the last matched parameter would be applied.)
BondCharge virtual site. Since we want a BondCharge on both ends, we specify
MonovalentLonePair virtual site(s) on the oxygen, with the aim of modeling both lone pairs. This one is subtle, since [#1:3]-[#6X3:2]=[#8X1:1] matches two unique groups of atoms (
2-3-4). It is important to note in this situation that match="all_permutations" behaves exactly the same as match="once". Due to the anchoring hydrogens (
2) being symmetric but opposite about the bond between
4, a single parameter does correctly place both lone pairs. A standing issue here is that the default exclusion policy (parents) will allow these two virtual sites to interact since they have different indexed atoms (parents), causing the energy to be different than the non-virtual site parameterization. In the future, the exclusion_policy="local" will account for this, and make virtual sites that share at least one “parent” atom not interact with each other. As a special note: when applying a MonovalentLonePair to a completely symmetric molecule, e.g. water, all_permutations can come into play, but this will apply two particles (one for each hydrogen).
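The effect of the match attribute in the water examples above can be sketched in plain Python. This is an illustration only, not the toolkit's implementation: the function name `select_orderings` is hypothetical, atoms are represented by integer indices, and keeping the first ordering found under "once" is an assumption made for the sketch.

```python
from typing import Dict, FrozenSet, List, Tuple

Ordering = Tuple[int, ...]


def select_orderings(matches: List[Ordering], match: str) -> List[Ordering]:
    """Choose which matched atom orderings receive a virtual particle."""
    if match not in ("once", "all_permutations"):
        raise ValueError(f"unsupported match setting: {match!r}")
    # Group the orderings that cover the same set of atoms.
    grouped: Dict[FrozenSet[int], List[Ordering]] = {}
    for ordering in matches:
        grouped.setdefault(frozenset(ordering), []).append(ordering)
    selected: List[Ordering] = []
    for orderings in grouped.values():
        if match == "once":
            # Assumption for this sketch: keep the first ordering encountered.
            selected.append(orderings[0])
        else:
            selected.extend(orderings)
    return selected


# Water with H=0, O=1, H=2: an H-O-H pattern matches in both directions.
water_matches = [(0, 1, 2), (2, 1, 0)]
print(len(select_orderings(water_matches, "once")))              # 1
print(len(select_orderings(water_matches, "all_permutations")))  # 2
```

With "once" the 4-point model gains a single particle, while "all_permutations" yields the two particles needed for the 5-point model.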
Finally, the toolkit handles the organization of atoms and virtual sites in a specific manner. Virtual sites are expected to be added after all molecules in the topology are present. This is because the Open Force Field Toolkit organizes a topology by placing all atoms first, then all virtual sites last. This differs from the OpenMM Modeller object, for example, which interleaves the order of atoms and virtual sites in such a way that all particles of a molecule are contiguous. In addition, due to the fact that a virtual site may contain multiple particles coupled to single parameters, the toolkit makes a distinction between a virtual site, and a virtual particle. A virtual site may represent multiple virtual particles, so the total number of particles cannot be directly determined by simply summing the number of atoms and virtual sites in a molecule. This is taken into account, however, and the
Topology classes now implement
Minor Feature: Support for the 0.4 ChargeIncrementModel tag
To allow for more convenient fitting of
ChargeIncrement parameters, it is now possible to specify one less
charge_increment value than there are tagged atoms in a
smirks. The missing
charge_increment value will be calculated at parameterization-time to make the sum of
the charge contributions from a
ChargeIncrement parameter equal to zero.
Since this change allows for force fields that are incompatible with
the previous specification, this new style of
ChargeIncrement must specify a
section version of
ChargeIncrement parameters are compatible with
More details and examples of this change are available in The ChargeIncrementModel tag in the SMIRNOFF specification
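The rule for the omitted charge_increment value can be sketched as follows. This is a minimal illustration with a hypothetical function name and plain floats standing in for quantities in elementary charge units; it is not the toolkit's own code.

```python
from typing import List, Sequence


def complete_charge_increments(given: Sequence[float], n_tagged_atoms: int) -> List[float]:
    """Return one increment per tagged atom, filling in an omitted final value.

    If exactly one value is missing, the final atom receives the offsetting
    charge so that the increments of this parameter sum to zero.
    """
    if len(given) == n_tagged_atoms:
        return list(given)
    if len(given) == n_tagged_atoms - 1:
        return list(given) + [-sum(given)]
    raise ValueError(
        f"expected {n_tagged_atoms} or {n_tagged_atoms - 1} values, got {len(given)}"
    )


# Three tagged atoms, two explicit increments (hypothetical numbers).
increments = complete_charge_increments([0.1, -0.03], n_tagged_atoms=3)
print(increments)
print(abs(sum(increments)) < 1e-12)  # True
```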
PR #726: Adds support for the 0.4 ChargeIncrementModel spec, allowing for the specification of one fewer charge_increment value than there are tagged atoms in the smirks, and automatically assigning the final atom an offsetting charge.
PR #548: Adds support for the VirtualSites tag in the SMIRNOFF specification
PR #548: Adds
PR #548: Adds
PR #548: Adds
PR #705: Adds interpolation based on fractional bond orders for harmonic bonds. This includes interpolation for the force constant k and/or the equilibrium bond distance length. This is accompanied by a bump in the <Bonds> section of the SMIRNOFF spec (but not the entire spec).
PR #743: Prevents the non-bonded (vdW) cutoff from silently falling back to the OpenMM default of 1 nm in ForceField.create_openmm_system and instead sets it to the value specified by the force field.
PR #737: Prevents OpenEye from incidentally being used in the conformer generation step of
PR #705: Changes the default values in the <Bonds> section of the SMIRNOFF spec to potential="(k/2)*(r-length)^2", which is backwards-compatible with and equivalent to
PR #548: Adds a virtual site example notebook to run an OpenMM simulation with virtual sites, and compares positions and potential energy of TIP5P water between OpenFF and OpenMM forcefields.
PR #548: Methods
now only accept a list of atoms, not a list of integers, to define the parent atoms
PR #548: Removes
PR #548: Removes
0.7.2 - Bugfix and minor feature release
PR #649: Makes SMARTS searches stereochemistry-specific (if stereo is specified in the SMARTS) for both OpenEye and RDKit backends. Also ensures molecule aromaticity is re-perceived according to the ForceField’s specified aromaticity model, which may overwrite user-specified aromaticity on the
PR #648: Removes the utils.structure module, which was deprecated in 0.2.0.
PR #675: Changes the exception raised when no antechamber executable is found from
PR #649: Prevents 2020 OE toolkit from issuing a warning caused by doing stereo-specific smarts searches on certain structures.
0.7.1 - OETK2020 Compatibility and Minor Update
This is the first of our patch releases on our new planned monthly release schedule.
Detailed release notes are below, but the major new features of this release are updates for
compatibility with the new 2020 OpenEye Toolkits release, the
get_available_force_fields function, and the disregarding of pyramidal nitrogen stereochemistry
in molecule isomorphism checks.
PR #646: Checking for Molecule equality using the == operator now disregards all pyramidal nitrogen stereochemistry by default. To re-enable, use
PR #634: Fixes a bug in which calling RDKitToolkitWrapper.from_file directly would not load files correctly if passed lowercase file_format. Note that this bug did not occur when calling
PR #631: Fixes a bug in which calling
None when the unit is dimensionless. Now
PR #656: Adds a new allowed am1elf10 option to the OpenEye implementation of assign_partial_charges which calculates the average partial charges at the AM1 level of theory using conformers selected using the ELF10 method.
PR #643: Adds
openforcefield.typing.engines.smirnoff.forcefield.get_available_force_fields, which returns paths to the files of force fields available through entry point plugins.
0.7.0 - Charge Increment Model, Proper Torsion interpolation, and new Molecule methods
This is a relatively large release, motivated by the idea that changing existing functionality is bad so we shouldn’t do it too often, but when we do change things we should do it all at once.
Here’s a brief rundown of what changed, migration tips, and how to find more details in the full release notes below:
To provide more consistent partial charges for a given molecule, existing conformers are now disregarded by default by
Molecule.assign_partial_charges. Instead, new conformers are generated for use in semiempirical calculations. Search for
Formal charges are now always returned as simtk.unit.Quantity objects, with units of elementary charge. To convert them to integers, use from simtk import unit and
The OpenFF Toolkit now automatically reads and writes partial charges in SDF files. Search for
The OpenFF Toolkit now has different behavior for handling multi-molecule and multi-conformer SDF files. Search
The OpenFF Toolkit now distinguishes between partial charges that are all-zero and partial charges that are unknown. Search
partial_charges = None.
Topology.to_openmm now assigns unique atom names by default. Search
Molecule equality checks are now done by graph comparison instead of SMILES comparison. Search
The ChemicalEnvironment module was almost entirely removed, as it is an outdated duplicate of some Chemper functionality. Search
TopologyMolecule.topology_particle_start_index has been removed from the TopologyMolecule API, since atoms and virtual sites are no longer contiguous in the Topology particle indexing system. Search
compute_wiberg_bond_orders has been renamed to
There are also a number of new features, such as:
ChargeIncrementModel sections in force fields.
k interpolation in force fields using fractional bond orders.
Support for AM1-Mulliken, Gasteiger, and other charge methods using the new
Support for AM1-Wiberg bond order calculation using either the OpenEye or RDKit/AmberTools backends and the
Initial (limited) interoperability with QCArchive, via
Molecule methods, including state enumeration and mapped SMILES creation.
Major Feature: Support for the SMIRNOFF ChargeIncrementModel tag
The ChargeIncrementModel tag in the SMIRNOFF specification provides analogous functionality to AM1-BCC, except that instead of AM1-Mulliken charges, a number of different charge methods can be called, and instead of a fixed library of two-atom charge corrections, an arbitrary number of SMIRKS-based, N-atom charge corrections can be defined in the SMIRNOFF format.
The initial implementation of the SMIRNOFF
ChargeIncrementModel tag accepts keywords for
partial_charge_method can be any string, and it is
up to the
compute_partial_charges methods to understand what they mean. For
number_of_conformers should be set to zero.
SMIRKS-based parameter application for
ChargeIncrement parameters is different than other SMIRNOFF sections.
The initial implementation of
ChargeIncrementModelHandler follows these rules:
an atom can be subject to many ChargeIncrement parameters, which combine additively.
a ChargeIncrement that matches a set of atoms is overwritten only if another ChargeIncrement matches the same group of atoms, regardless of order. This overriding follows the normal SMIRNOFF hierarchy.
To give a concise example, what if a molecule A-B(-C)-D were being parametrized, and the force field defined ChargeIncrement SMIRKS in the following order?
In the case above, the ChargeIncrement from parameters 1 and 4 would NOT be applied to the molecule, since another parameter matching the same set of atoms is specified further down in the parameter hierarchy (despite those subsequent matches being in a different order).
Ultimately, the ChargeIncrement contributions from parameters 2, 3, and 5 would be summed and applied.
It’s also important to identify a behavior that these rules were written to avoid: if not for the
“regardless of order” clause in the second rule, parameters 4 and 5 could actually have been applied six and two times,
respectively (due to symmetry in the SMIRKS and the use of wildcards). This situation could also arise as a result
of molecular symmetry. For example, a methyl group could match the SMIRKS
[C:1]([H:2])([H:3])([H:4]) six ways
(with different orderings of the three hydrogen atoms), but the user would almost certainly not intend for the charge
increments to be applied six times. The “regardless of order” clause was added specifically to address this.
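The two rules above, additive combination and overriding by atom group regardless of order, can be sketched with an order-insensitive key. This is an illustration under stated assumptions rather than the ChargeIncrementModelHandler code: the function name, the parameter identifiers, and the numeric increments are all hypothetical.

```python
from itertools import permutations
from typing import Dict, FrozenSet, List, Sequence, Tuple

# (parameter id, matched atom indices in tag order), listed in hierarchy order
Match = Tuple[str, Tuple[int, ...]]


def apply_charge_increments(
    matches: Sequence[Match],
    increments: Dict[str, Sequence[float]],
    n_atoms: int,
) -> List[float]:
    """Sum charge increments, keeping one match per unordered group of atoms."""
    slots: Dict[FrozenSet[int], Match] = {}
    for parameter_id, atoms in matches:
        # A later match on the same atoms (in any order) takes over the slot.
        slots[frozenset(atoms)] = (parameter_id, atoms)
    charges = [0.0] * n_atoms
    for parameter_id, atoms in slots.values():
        for atom, delta in zip(atoms, increments[parameter_id]):
            charges[atom] += delta
    return charges


# Methyl group: carbon is atom 0, hydrogens are atoms 1-3.
# The same parameter matches six ways, one per ordering of the hydrogens.
methyl_matches = [("methyl", (0,) + hydrogens) for hydrogens in permutations((1, 2, 3))]
methyl_increments = {"methyl": (-0.3, 0.1, 0.1, 0.1)}
charges = apply_charge_increments(methyl_matches, methyl_increments, n_atoms=4)
print(len(methyl_matches))  # 6
print(charges)              # [-0.3, 0.1, 0.1, 0.1]
```

Because all six orderings share one slot, the increments are applied once rather than six times.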
In short, the first time a group of atoms becomes involved in a
ChargeIncrement together, the System gains a new
parameter “slot”. Only another
ChargeIncrement which applies to the exact same group of atoms (in any order) can
take over the “slot”, pushing the original
Major Feature: Support for ProperTorsion k value interpolation
Chaya Stern’s work
showed that we may be able to produce higher-quality proper torsion parameters by taking into
account the “partial bond order” of the torsion’s central bond. We now have the machinery to compute AM1-Wiberg
partial bond orders for entire molecules using the
assign_fractional_bond_orders methods of either
AmberToolsToolkitWrapper. The thought is that, if some simple electron population analysis shows
that a certain aromatic bond’s order is 1.53, maybe rotations about that bond can be described well by interpolating
53% of the way between the single and double bond k values.
Full details of how to define a torsion-interpolating SMIRNOFF force field are available in the ProperTorsions section of the SMIRNOFF specification.
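The interpolation idea described in this section can be sketched as a simple linear interpolation between the k values defined at bond orders 1 and 2. The function name and the numeric k values are hypothetical, and the sketch assumes exactly two reference bond orders; the specification remains the authority on how interpolation is actually defined.

```python
def interpolate_k(k_bondorder1: float, k_bondorder2: float, bond_order: float) -> float:
    """Linearly interpolate a torsion k value between bond orders 1 and 2."""
    fraction = bond_order - 1.0  # 0.0 at a single bond, 1.0 at a double bond
    return k_bondorder1 + fraction * (k_bondorder2 - k_bondorder1)


# Hypothetical k values (arbitrary energy units) for the single and double bond.
k_single, k_double = 1.0, 5.0

# A bond order of 1.53 lands 53% of the way from k_single to k_double.
k = interpolate_k(k_single, k_double, bond_order=1.53)
print(round(k, 6))  # 3.12
```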
PR #508: In order to provide the same results for the same chemical species, regardless of input conformation, assign_fractional_bond_orders methods now default to ignoring input conformers and generating new conformer(s) of the molecule before running semiempirical calculations. Users can override this behavior by specifying the keyword argument
PR #281: If a partial_charges attribute is set to None (the default value), calling to_openeye will now produce an OE molecule with partial charges set to nan. This would previously produce an OE molecule with partial charges of 0.0, which was a loss of information, since it wouldn’t be clear whether the original OFFMol’s partial charges were REALLY all-zero as opposed to None. OpenEye toolkit wrapper methods such as from_file now produce OFFMols with partial_charges = None when appropriate (previously these would produce OFFMols with all-zero charges, for the same reasoning as above).
to_rdkit now sets partial charges on the RDAtom’s PartialCharges property (this was previously set on the partial_charges property). If the Molecule’s partial_charges attribute is None, this property will not be defined on the RDAtoms.
PR #281: Enforce the behavior during SDF I/O that a SDF may contain multiple molecules, but that the OFF Toolkit does not assume that it contains multiple conformers of the same molecule. This is an important distinction, since otherwise there is ambiguity around whether properties of one entry in a SDF are shared among several molecule blocks or not, or how to resolve conflicts if properties are defined differently for several “conformers” of chemically-identical species (More info here). If the user requests the OFF Toolkit to write a multi-conformer
Molecule to SDF, only the first conformer will be written. For more fine-grained control of writing properties, conformers, and partial charges, consider using Molecule.to_openeye and using the functionality offered by those packages.
PR #281: Due to different constraints placed on the data types allowed by external toolkits, we make our best effort to preserve
properties when converting molecules to other packages, but users should be aware that no guarantee of data integrity is made. The only data format for keys and values in the property dict that we will try to support through a roundtrip to another toolkit’s Molecule object is
PR #574: Removed check that all partial charges are zero after assignment by quacpac when AM1BCC is used for charge assignment. This check fails erroneously for cases in which the partial charge assignments are correctly all zero, such as for N#N. It is also an unnecessary check given that quacpac will reliably indicate when it has failed to assign charges.
PR #597: Energy-minimized sample systems with Parsley 1.1.0.
PR #469: When running
Topology.to_openmm, unique atom names are generated if the provided atom names are not unique (overriding any existing atom names). This uniqueness extends only to atoms in the same molecule. To disable this behavior, set the kwarg
PR #471: Closes Issue #465.
simtk.unit.Quantity objects instead of integers. To preserve backward compatibility, the setter for atom.formal_charge can accept either a simtk.unit.Quantity or an integer.
PR #601: Removes almost all of the previous ChemicalEnvironment API, since this entire module was simply copied from Chemper several years ago and has fallen behind on updates. Currently only ChemicalEnvironment.validate, and an equivalent classmethod ChemicalEnvironment.validate_smirks remain. Also, please comment on this GitHub issue if you HAVE been using the previous extra functionality in this module and would like us to prioritize creation of a Chemper conda package.
PR #558: Removes TopologyMolecule.topology_particle_start_index, since the Topology particle indexing system now orders TopologyVirtualSites after all atoms.
TopologyMolecule.topology_virtual_site_start_indexare still available to access the appropriate values in the respective topology indexing systems.
charge_model keyword is now
bond_order_model. The allowed values of this keyword have changed from
PR #595: Removed functions
openforcefield.utils.utils.temporary_cd and replaced their behavior with
PR #471: Implements
Molecule.assign_partial_charges, which calls one of the newly-implemented
strict_n_conformers is an optional boolean keyword argument indicating whether an IncorrectNumConformersError should be raised if an invalid number of conformers is supplied during partial charge calculation. For example, if two conformers are supplied, but partial_charge_method="AM1BCC" is also set, then there is no clear use for the second conformer. The previous behavior in this case was to raise a warning, and to preserve that behavior, strict_n_conformers defaults to a value of
PR #471: Adds keyword argument
ToolkitRegistry.call. The default value will provide the previous OpenFF Toolkit behavior, which is that the first ToolkitWrapper that can provide the requested method is called, and it either returns on success or raises an exception. This new keyword argument allows the ToolkitRegistry to ignore certain exceptions, but treat others as fatal. If
raise_exception_types = , the ToolkitRegistry will attempt to call each ToolkitWrapper that provides the requested method and if none succeeds, a single
ValueError will be raised, with text listing the errors that were raised by each ToolkitWrapper.
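The fallback behavior described for ToolkitRegistry.call can be sketched without the toolkit itself. Everything below is hypothetical and for illustration only: `call_first_available` is not a toolkit function, the providers are plain callables standing in for ToolkitWrapper methods, and both the `(Exception,)` default and the empty tuple used to mean "treat no exception as fatal" are assumptions of this sketch.

```python
from typing import Callable, List, Sequence, Tuple, Type


def call_first_available(
    providers: Sequence[Tuple[str, Callable[[], object]]],
    raise_exception_types: Tuple[Type[BaseException], ...] = (Exception,),
) -> object:
    """Try each provider in order, re-raising only the listed exception types."""
    errors: List[str] = []
    for name, func in providers:
        try:
            return func()
        except Exception as error:
            if isinstance(error, raise_exception_types):
                raise  # fatal: stop at the first provider, as in the old behavior
            errors.append(f"{name}: {type(error).__name__}: {error}")
    # No provider succeeded: report every collected error in a single ValueError.
    raise ValueError("No provider succeeded:\n" + "\n".join(errors))


def unlicensed_backend() -> int:
    raise RuntimeError("license not found")


def working_backend() -> int:
    return 42


providers = [("first", unlicensed_backend), ("second", working_backend)]
print(call_first_available(providers, raise_exception_types=()))  # 42
```

With the default argument the RuntimeError from the first provider propagates immediately; with an empty tuple the second provider gets a chance to succeed.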
PR #601: Adds
OpenEyeToolkitWrapper.get_tagged_smarts_connectivity, which allow the use of either toolkit for smirks/tagged smarts validation.
PR #600: Adds ForceField.__getitem__ to look up ParameterHandler objects based on their string names.
PR #508: Adds
PR #472: Adds to the
The to_qcschema method accepts an extras dictionary which is passed into the validated qcelemental.models.Molecule object.
InChI was not designed as a molecule interchange format and using it as one is not recommended. Many round trip tests will fail when using this format due to a loss of information. We have also added support for fixed hydrogen layer nonstandard InChI which can help in the case of tautomers, but overall creating molecules from InChI should be avoided.
PR #529: Adds the ability to write out to XYZ files via Molecule.to_file. Both single frame and multiframe XYZ files are supported. Note that reading from XYZ files will not be supported due to the lack of connectivity information.
PR #535: Extends the API for Molecule.to_smiles to allow for the creation of cmiles identifiers through combinations of isomeric, explicit hydrogen and mapped smiles. The default settings will return isomeric explicit hydrogen smiles as expected.
Atom maps can be supplied to the properties dictionary to modify which atoms have their map index included, if no map is supplied all atoms will be mapped in the order they appear in the
PR #563: Adds
LibraryCharges for monatomic ions.
PR #543: Adds 3 new methods to the
Molecule class which allow the enumeration of molecule states. These are
Enumerate protomers is currently only available through the OpenEye toolkit.
PR #573: Adds
quacpac error output to
PR #560: Added visualization method to the Molecule class.
PR #620: Added the ability to register parameter handlers via entry point plugins. This functionality is accessible by initializing a
PR #558: Adds tests ensuring that the new Topology particle indexing system is properly implemented, and that TopologyVirtualSites reference the correct TopologyAtoms.
PR #506: Added a test for the molecule identified in issue #513 as losing aromaticity when converted to rdkit.
PR #506: Added a variety of toolkit-dependent tests for identifying rotatable bonds while ignoring the user requested types.
PR #529: Added to XYZ file coverage tests.
PR #563: Added LibraryCharges parameterization test for monatomic ions in
PR #543: Added tests to ensure that state enumeration can correctly find a molecule's tautomers, stereoisomers, and protomers when possible.
PR #573: Added test for
quacpac error output for
PR #579: Adds regression tests to ensure RDKit can be used to write multi-model PDB files.
PR #582: Added fractional bond order interpolation tests, tests for
PR #558: Fixes a bug where TopologyVirtualSite.atoms would not correctly apply TopologyMolecule atom ordering on top of the reference molecule ordering, in cases where the same molecule appears multiple times, but in a different order, in the same Topology.
Issue #460: Creates unique atom names in Topology.to_openmm if the existing ones are not unique. The lack of unique atom names had been causing problems in workflows involving downstream tools that expect unique atom names.
Issue #448: We can now make molecules from mapped smiles using
Molecule.from_mapped_smiles where the order will correspond to the indexing used in the smiles. Molecules can also be re-indexed at any time using the
Issue #412: We can now instance the
Molecule.from_mapped_smiles. This resolves an issue caused by RDKit considering atom map indices to be a distinguishing feature of an atom, which led to erroneous definition of chirality (as otherwise symmetric substituents would be seen as different). We anticipate that this will reduce the number of times you need to type
allow_undefined_stereo=True when processing molecules that do not actually contain stereochemistry.
Issue #491: We can now parse large molecules without hitting a match limit cap.
Issue #474: We can now convert molecules to InChI and InChIKey and from InChI.
PR #591 and PR #533: Adds an example notebook and utility to compute conformer energies. This example is made to be reverse-compatible with the 0.6.0 OpenFF Toolkit release.
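The unique-atom-name behavior described in this release for Topology.to_openmm (PR #469, Issue #460) can be sketched for a single molecule as follows. This is only an illustration: the function name is hypothetical, and the element-symbol-plus-counter naming scheme is an assumption of the sketch rather than a statement of what the toolkit generates.

```python
from collections import Counter
from typing import List


def ensure_unique_atom_names(names: List[str], elements: List[str]) -> List[str]:
    """Keep names that are already unique within a molecule, else regenerate all."""
    if len(set(names)) == len(names):
        return list(names)
    # Assumed scheme: element symbol followed by a per-element running count.
    # Regenerating overrides any existing names, as the release notes describe.
    counts: Counter = Counter()
    unique: List[str] = []
    for element in elements:
        counts[element] += 1
        unique.append(f"{element}{counts[element]}")
    return unique


print(ensure_unique_atom_names(["H", "H", "O"], ["H", "H", "O"]))    # ['H1', 'H2', 'O1']
print(ensure_unique_atom_names(["HA", "HB", "O"], ["H", "H", "O"]))  # ['HA', 'HB', 'O']
```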
0.6.0 - Library Charges
This release adds support for a new SMIRKS-based charge assignment method, Library Charges. The addition of more charge assignment methods opens the door for new types of experimentation, but also introduces several complex behaviors and failure modes. Accordingly, we have made changes to the charge assignment infrastructure to check for cases when partial charges do not sum to the formal charge of the molecule, or when no charge assignment method is able to generate charges for a molecule. More detailed explanation of the new errors that may be raised and keywords for overriding them are in the “Behavior Changed” section below.
With this release, we update
test_forcefields/tip3p.offxml to be a working example of assigning LibraryCharges.
However, we do not provide any force field files to assign protein residue
If you are interested in translating an existing protein FF to SMIRNOFF format or developing a new one, please
feel free to contact us on the Issue tracker or open a
PR #433: Closes Issue #25 by adding initial support for the LibraryCharges tag in the SMIRNOFF specification using LibraryChargeHandler. For a molecule to have charges assigned using Library Charges, all of its atoms must be covered by at least one LibraryCharge. If an atom is covered by multiple LibraryCharges, then the last LibraryCharge matched will be applied (per the hierarchy rules in the SMIRNOFF format).
This functionality is thus able to apply per-residue charges similar to those in traditional protein force fields. At this time, there is no concept of “residues” or “fragments” during parametrization, so it is not possible to assign charges to some atoms in a molecule using
LibraryCharges, but calculate charges for other atoms in the same molecule using a different method. To assign charges to a protein, LibraryCharges SMARTS must be provided for the residues and protonation states in the molecule, as well as for any capping groups and post-translational modifications that are present.
It is valid for LibraryCharge SMARTS to partially overlap one another. For example, a molecule consisting of atoms A-B-C connected by single bonds could be matched by a SMIRNOFF LibraryCharges section containing two
B-C. If listed in that order, the molecule would be assigned the A charge from the
LibraryCharge element and the
C charges from the B-C element. In testing, these types of partial overlaps were found to frequently be sources of undesired behavior, so it is recommended that users define whole-molecule LibraryCharge SMARTS whenever possible.
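The last-match-wins rule and the hazard of partial overlap can be sketched for the A-B-C example. The function name and all charge values below are hypothetical, atoms are represented by integer indices (A=0, B=1, C=2), and the sketch is not the LibraryChargeHandler implementation.

```python
from typing import List, Optional, Sequence, Tuple

# (matched atom indices, charges for those atoms), listed in force field order
LibraryMatch = Tuple[Tuple[int, ...], Tuple[float, ...]]


def assign_library_charges(n_atoms: int, matches: Sequence[LibraryMatch]) -> List[float]:
    """Assign per-atom charges, letting later matches override earlier ones."""
    charges: List[Optional[float]] = [None] * n_atoms
    for atoms, values in matches:
        for atom, value in zip(atoms, values):
            charges[atom] = value
    if any(charge is None for charge in charges):
        raise ValueError("every atom must be covered by at least one LibraryCharge")
    return [float(charge) for charge in charges]


# Hypothetical charges: an A-B match listed first, then a B-C match.
matches = [((0, 1), (0.1, -0.1)), ((1, 2), (-0.4, 0.4))]
charges = assign_library_charges(3, matches)
print(charges)       # [0.1, -0.4, 0.4]
print(sum(charges))  # close to 0.1 rather than 0
```

Each match is neutral on its own, yet the overlap on B leaves the molecule with a net charge of roughly 0.1, which is the kind of undesired result that whole-molecule SMARTS avoid.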
PR #455: Addresses Issue #393 by adding
ParameterType.attribute_is_cosmetic, which return True if the provided attribute name is defined for the queried object but does not correspond to an allowed value in the SMIRNOFF spec.
PR #433: If a molecule cannot be assigned charges by any charge-assignment method, an openforcefield.typing.engines.smirnoff.parameters.UnassignedMoleculeChargeException will be raised. Previously, creating a system without either
charge_from_molecules keyword argument to ForceField.create_openmm_system would produce a system where the molecule has zero charge on all atoms. However, given that we will soon be adding more options for charge assignment, it is important that failures not be silent. Molecules with zero charge can still be produced by setting the Molecule.partial_charges array to be all zeroes, and including the molecule in the charge_from_molecules keyword argument to
PR #433: Due to risks introduced by permitting charge assignment using partially-overlapping LibraryCharges, the toolkit will now raise an openforcefield.typing.engines.smirnoff.parameters.NonIntegralMoleculeChargeException if the sum of partial charges on a molecule is found to be more than 0.01 elementary charge units different than the molecule’s formal charge. This exception can be overridden by providing the allow_nonintegral_charges=True keyword argument to
PR #430: Added test for Wiberg Bond Order implemented in OpenEye Toolkits. Molecules taken from DOI:10.5281/zenodo.3405489. Added by Sukanya Sasmal.
PR #569: Added round-trip tests for more serialization formats (dict, YAML, TOML, JSON, BSON, messagepack, pickle). Note that some are unsupported, but the tests raise the appropriate error.
PR #431: Fixes an issue where
ToolkitWrapper objects would improperly search for functionality in the
GLOBAL_TOOLKIT_REGISTRY, even though a specific
ToolkitRegistry was requested for an operation.
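The interaction between partially overlapping LibraryCharge entries and the integral-charge check described in the entries above can be illustrated with a toy sketch. Everything here is hypothetical and independent of the toolkit: atoms are plain labels, each library entry is a dict of charges rather than a SMARTS match, the first entry is assumed to cover atoms A and B, and `assign_library_charges` and `check_integral_charge` are illustrative stand-ins. Only the "later entry wins" ordering and the 0.01 elementary charge tolerance are taken from the text.

```python
def assign_library_charges(atoms, entries):
    """Apply entries in listed order; a later entry overwrites an earlier one."""
    charges = {}
    for entry in entries:
        # Toy matching rule: an entry applies if all of its atoms are present.
        if all(atom in atoms for atom in entry):
            charges.update(entry)
    return charges


def check_integral_charge(charges, formal_charge, tolerance=0.01):
    """Raise if the summed partial charges deviate from the formal charge."""
    total = sum(charges.values())
    if abs(total - formal_charge) > tolerance:
        raise ValueError(
            f"partial charges sum to {total:.3f}, expected {formal_charge}"
        )
    return total


atoms = ["A", "B", "C"]
entries = [
    {"A": -0.2, "B": 0.2},  # assumed first entry, covering A and B
    {"B": 0.3, "C": -0.3},  # B-C entry listed second, so it overrides B
]
charges = assign_library_charges(atoms, entries)

# Each entry is neutral on its own, but the overlap leaves a net charge.
try:
    check_integral_charge(charges, formal_charge=0)
    overlap_ok = True
except ValueError:
    overlap_ok = False
```

In this toy case atom A keeps its charge from the first entry while B and C take theirs from the second, so the molecule no longer sums to its formal charge and the check fails. This is the kind of silent inconsistency that whole-molecule LibraryCharge SMARTS avoid.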
0.5.1 - Adding the parameter coverage example notebook¶
This release contains a new notebook example, check_parameter_coverage.ipynb, which loads sets of molecules, checks whether they are parameterizable, and generates reports of chemical motifs that are not. It also fixes several simple issues, improves warnings and docstring text, and removes unused files.
The parameter coverage example notebook goes hand-in-hand with the release candidate of our initial force field, openff-1.0.0-RC1.offxml, which will be temporarily available until the official force field release is made in October. Our goal in publishing this notebook alongside our first major refitting is to allow interested users to check whether there is parameter coverage for their molecules of interest. If the force field is unable to parameterize a molecule, this notebook will generate reports of the specific chemistry that is not covered. We understand that many organizations in our field have restrictions about sharing specific molecules, and the outputs from this notebook can easily be cropped to communicate unparameterizable chemistry without revealing the full structure.
The force field release candidate is in our new refit force field package, openforcefields. This package is now a part of the Open Force Field Toolkit conda recipe, along with the original smirnoff99Frosst line of force fields.
Once the openforcefields conda package is installed, you can load the release candidate using:
from openforcefield.typing.engines.smirnoff import ForceField
ff = ForceField('openff-1.0.0-RC1.offxml')
The release candidate will be removed when the official force field,
openff-1.0.0.offxml, is released in early October.
Complete details about this release are below.
PR #419: Unassigned valence parameter exceptions now include a list of tuples of
TopologyAtom which were unable to be parameterized (
exception.unassigned_topology_atom_tuples) and the class of the
ParameterHandler that raised the exception (
PR #425: Implements Trevor Gokey’s suggestion from Issue #411, which enables pickling of
ParameterHandlers. Note that, while XML representations of ForceFields are stable and conform to the SMIRNOFF specification, the pickled ForceFields that this functionality enables are not guaranteed to be compatible with future toolkit versions.
Improved documentation and warnings¶
PR #425: Addresses Issue #410, by explicitly having toolkit warnings print
Warning: at the beginning of each warning, and adding clearer language to the warning produced when the OpenEye Toolkits cannot be loaded.
0.5.0 - GBSA support and quality-of-life improvements¶
This release adds support for the
GBSA tag in the SMIRNOFF specification.
OBC2 models (corresponding to AMBER keywords
5, respectively) are supported, with the
OBC2 implementation being
the most flexible. Unfortunately, systems produced
using these keywords are not yet transferable to other simulation packages via ParmEd, so users are restricted
to using OpenMM to simulate systems with GBSA.
OFFXML files containing GBSA parameter definitions are available,
and can be loaded in addition to existing parameter sets (for example, with the command
A manifest of new SMIRNOFF-format GBSA files is below.
Several other user-facing improvements have been added, including easier access to indexed attributes,
which are now accessible as
torsion.k2, etc. (the previous access method
torsion.k still works as well). More details of the new features and several bugfixes are listed below.
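One way to picture the two access styles is a small class that stores an indexed attribute as a list and resolves names such as `k2` to list positions. This is only an illustrative sketch, not the toolkit's implementation: the class `IndexedParameter` is hypothetical, and it assumes 1-based suffixes, so that `k2` refers to the second element of `k`.

```python
import re


class IndexedParameter:
    """Hypothetical parameter holding one indexed attribute, `k`."""

    _INDEXED = ("k",)  # names that may be accessed with a numeric suffix

    def __init__(self, k):
        self.k = list(k)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails, e.g. for "k2".
        match = re.fullmatch(r"([A-Za-z_]+)(\d+)", name)
        if match and match.group(1) in self._INDEXED:
            values = self.__dict__.get(match.group(1), [])
            index = int(match.group(2)) - 1  # assumed 1-based suffix
            if 0 <= index < len(values):
                return values[index]
        raise AttributeError(name)


torsion = IndexedParameter(k=[1.0, 0.5, 0.25])
second_by_name = torsion.k2    # suffix-style access
second_by_list = torsion.k[1]  # list-style access to the same value
```

Because `__getattr__` is only consulted after ordinary lookup fails, the list stored under `k` stays directly accessible and the suffixed names are a thin convenience layer on top of it.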
PR #363: Implements
GBSAHandler, which supports the GBSA tag in the SMIRNOFF specification. Currently, only GBSAHandlers with
gb_model="OBC2" support setting non-default values for the
5.4*calories/mole/angstroms**2), though users can zero the SA term for
HCT models by setting
sa_model="None". No model currently supports setting
solvent_radius to any value other than
1.4*angstroms. Files containing experimental SMIRNOFF-format implementations of
OBC2 are included with this release (see below). Additional details of these models, including literature references, are available on the SMIRNOFF specification page.
The current release of ParmEd cannot transfer GBSA models produced by the Open Force Field Toolkit to other simulation packages. These GBSA forces are currently only computable using OpenMM.
PR #394: Include element and atom name in error output when there are missing valence parameters during molecule parameterization.
PR #385: Fixes Issue #346 by having
OpenEyeToolkitWrapper.compute_partial_charges_am1bcc fall back to using standard AM1-BCC if AM1-BCC ELF10 charge generation raises an error about “trans COOH conformers”.
PR #400: Makes link-checking tests retry three times before failing.
PR #363: Adds
test_forcefields/GBSA_OBC2-1.0.offxml, which are experimental implementations of GBSA models. These are primarily used in validation tests against OpenMM’s models, and their version numbers will increment if bugfixes are necessary.
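The retry behavior mentioned for the link-checking tests in PR #400 can be sketched with a small generic helper. This is an illustration rather than the toolkit's test code: `retry`, `check_link`, and the simulated failure are hypothetical names introduced here, and the sketch treats "three times" as three attempts in total, which is an assumption.

```python
def retry(func, attempts=3):
    """Call func until it succeeds or the attempts are used up."""
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except Exception as error:  # keep the last failure for re-raising
            last_error = error
    raise last_error


calls = {"count": 0}


def check_link():
    """Stand-in for a network check that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return "ok"


result = retry(check_link)
```

A transient failure on the first or second attempt is absorbed, while a link that fails on every attempt still surfaces the final error to the test runner.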
0.4.1 - Bugfix Release¶
This update fixes several toolkit bugs that have been reported by the community. Details of these bugfixes are provided below.
It also refactors how
store their attributes, by introducing
These new attribute-handling classes provide a consistent backend which should simplify manipulation of parameters
and implementation of new handlers.
PR #329: Fixed a bug where the two
length were treated as indexed attributes. (
length values that correspond to specific bond orders will be indexed under
k_bondorder2, etc. when implemented in the future.)
PR #329: Fixed a bug that allowed setting indexed attributes to single values instead of strictly lists.
PR #351: Fixes a bug where a molecule which previously generated a SMILES using one cheminformatics toolkit returns the same SMILES, even though a different toolkit (which would generate a different SMILES for the molecule) is explicitly called.
PR #354: Fixes the error message that is printed if an unexpected parameter attribute is found while loading data into a
ForceField (now instructs users to specify
PR #364: Fixes Issue #362 by modifying
RDKitToolkitWrapper.from_smiles to make implicit hydrogens explicit before molecule creation. These functions also now raise an error if the optional keyword
hydrogens_are_explicit=True but the SMILES are interpreted by the backend cheminformatics toolkit as having implicit hydrogens.
PR #371: Fixes error when reading early SMIRNOFF 0.1 spec files enclosed by a top-level
SMIRFF tag is present only in legacy files.
Since developing a formal specification, the only acceptable top-level tag value in a SMIRNOFF data structure is
Force fields added¶
PR #368: Temporarily adds
test_forcefields/smirnoff99frosst_experimental.offxml to address hierarchy problems, redundancies, SMIRKS pattern typos, etc., as documented in issue #367. Will ultimately be propagated to an updated forcefield in the
PR #371: Adds
test_forcefields/smirff99Frosst_reference_0_1_spec.offxml, a SMIRNOFF 0.1 spec file enclosed by the legacy
SMIRFF tag. This file is used in backwards-compatibility testing.
0.4.0 - Performance optimizations and support for SMIRNOFF 0.3 specification¶
This update contains performance enhancements that significantly reduce the time to create OpenMM systems for topologies containing many molecules via
This update also introduces the SMIRNOFF 0.3 specification. The spec update is the result of discussions about how to handle the evolution of data and parameter types as further functional forms are added to the SMIRNOFF spec.
We provide methods to convert SMIRNOFF 0.1 and 0.2 forcefields written with the XML serialization (
.offxml) to the SMIRNOFF 0.3 specification.
These methods are called automatically when loading a serialized SMIRNOFF data representation written in the 0.1 or 0.2 specification.
This functionality allows the toolkit to continue to read files containing SMIRNOFF 0.2 spec force fields, and also implements backwards-compatibility for SMIRNOFF 0.1 spec force fields.
The SMIRNOFF 0.1 spec did not contain fields for several energy-determining parameters that are exposed in later SMIRNOFF specs. Thus, when reading SMIRNOFF 0.1 spec data, the toolkit must make assumptions about the values that should be added for the newly-required fields. The values that are added include 1-2, 1-3 and 1-5 scaling factors, cutoffs, and long-range treatments for nonbonded interactions. Each assumption is printed as a warning during the conversion process. Please carefully review the warning messages to ensure that the conversion is providing your desired behavior.
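The automatic upgrade path described above (0.1 to 0.2 to 0.3, with a warning for every assumed value) can be pictured as a chain of per-version converters. The sketch below is a simplified illustration and not the toolkit's conversion code: the function names, the flat dict layout, and the `example_cutoff` field with its placeholder value are all hypothetical.

```python
import warnings

# Hypothetical stand-in for a field that the 0.1 spec lacks, with the value
# a converter would have to assume for it.
ASSUMED_DEFAULTS = {"example_cutoff": "placeholder-value"}


def upgrade_0_1_to_0_2(data):
    """Return a 0.2-style copy, warning about every value that is assumed."""
    upgraded = dict(data, version="0.2")
    for field, value in ASSUMED_DEFAULTS.items():
        if field not in upgraded:
            warnings.warn(f"0.1 data has no '{field}'; assuming {value!r}")
            upgraded[field] = value
    return upgraded


def upgrade_0_2_to_0_3(data):
    """Return a 0.3-style copy (no new assumptions in this toy version)."""
    return dict(data, version="0.3")


UPGRADES = {"0.1": upgrade_0_1_to_0_2, "0.2": upgrade_0_2_to_0_3}


def upgrade_to_latest(data):
    """Apply converters one version at a time until none applies."""
    while data["version"] in UPGRADES:
        data = UPGRADES[data["version"]](data)
    return data


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    latest = upgrade_to_latest({"version": "0.1"})
```

Chaining single-step converters means that 0.1 data passes through the same 0.2-to-0.3 step as native 0.2 data, and each assumption made along the way is reported exactly once.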
The SMIRNOFF 0.3 spec introduces versioning for each individual parameter section, allowing asynchronous updates to the features of each parameter class. The top-level
SMIRNOFF tag, containing information like
Date, still has a version (currently 0.3). But, to allow for independent development of individual parameter types, each section (such as
Angles, etc.) now has its own version as well (currently all 0.3).
All units are now stored in expressions with their corresponding values. For example, distances are now stored as
1.526*angstrom, instead of storing the unit separately in the section header.
The current allowed value of the
ImproperTorsions tags is no longer
charmm, but is rather
k*(1+cos(periodicity*theta-phase)). It was pointed out to us that CHARMM-style torsions deviate from this formula when the periodicity of a torsion term is 0, and we do not intend to reproduce that behavior.
SMIRNOFF spec documentation has been updated with tables of keywords and their defaults for each parameter section and parameter type. These tables will track the allowed keywords and default behavior as updated versions of individual parameter sections are released.
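The torsion functional form given above, k*(1+cos(periodicity*theta-phase)), can be evaluated directly. The sketch below is a plain-Python illustration with a hypothetical function name and unitless inputs (angles in radians); it is not the toolkit's or OpenMM's implementation.

```python
import math


def torsion_energy(theta, k, periodicity, phase):
    """Evaluate k * (1 + cos(periodicity * theta - phase)) for one term."""
    return k * (1.0 + math.cos(periodicity * theta - phase))


# Two-fold term with zero phase: maximum at theta = 0, minimum at pi / 2.
e_max = torsion_energy(theta=0.0, k=1.5, periodicity=2, phase=0.0)
e_min = torsion_energy(theta=math.pi / 2, k=1.5, periodicity=2, phase=0.0)

# With periodicity 0 the formula no longer depends on theta at all.
e_flat_a = torsion_energy(theta=0.3, k=1.5, periodicity=0, phase=0.0)
e_flat_b = torsion_energy(theta=2.0, k=1.5, periodicity=0, phase=0.0)
```

Under this formula a term with periodicity 0 contributes a constant, k*(1+cos(-phase)), whatever the torsion angle is.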
Performance improvements and bugfixes¶
PR #311: Several new experimental functions.
convert_0_2_smirnoff_to_0_3, which takes a SMIRNOFF 0.2-spec data dict, and updates it to 0.3. This function is called automatically when creating a
ForceField from a SMIRNOFF 0.2 spec OFFXML file.
convert_0_1_smirnoff_to_0_2, which takes a SMIRNOFF 0.1-spec data dict, and updates it to 0.2. This function is called automatically when creating a
ForceField from a SMIRNOFF 0.1 spec OFFXML file.
NOTE: The format of the “SMIRNOFF data dict” above is likely to change significantly in the future. Users that require a stable serialized ForceField object should use the output of
delete_cosmetic_attribute functions. Once created, cosmetic attributes can be accessed and modified as attributes of the underlying object (e.g.
ParameterType.my_cosmetic_attrib = 'blue'). These functions are experimental, and we are interested in feedback on how cosmetic attribute handling could be improved (see Issue #338). Note that if a new cosmetic attribute is added to an object without using these functions, it will not be recognized by the toolkit and will not be written out during serialization.
Values for the top-level
Date tags are now kept during SMIRNOFF data I/O. If multiple data sources containing these fields are read, the values are concatenated using “AND” as a separator.
The scripts in
utilities/convert_frosst are now deprecated. This functionality is important for provenance and will be migrated to the
openforcefield/smirnoff99Frosst repository in the coming weeks.
0.3.0 - API Improvements¶
Several improvements and changes to public API.
PR #292: Implement
PR #322: Install directories for the lookup of OFFXML files through the entry point group
ForceField class doesn’t search in the
data/forcefield/ folder anymore (now renamed
data/test_forcefields/), but only in
PR #327: Fix units in tip3p.offxml (note that this file is still not loadable by current toolkit)
PR #325: Fix solvent box for provided test system to resolve periodic clashes.
PR #325: Add informative message containing Hill formula when a molecule can’t be matched in
PR #325: Provide warning or error message as appropriate when a molecule is missing stereochemistry.
PR #316: Fix formatting issues in GBSA section of SMIRNOFF spec
PR #308: Cache molecule SMILES to improve system creation speed
PR #306: Allow single-atom molecules with all zero coordinates to be converted to OE/RDK mols
PR #313: Fix issue where constraints are applied twice to constrained bonds
0.2.2 - Bugfix release¶
This release modifies an example to show how to parameterize a solvated system, cleans up backend code, and makes several improvements to the README.
0.2.1 - Bugfix release¶
This release features various documentation fixes, minor bugfixes, and code cleanup.
0.2.0 - Initial RDKit support¶
This version of the toolkit introduces many new features on the way to a 1.0.0 release.
Major overhaul, resulting in the creation of the SMIRNOFF 0.2 specification and its XML representation
Updated API and infrastructure for reference SMIRNOFF
Implementation of modular
ParameterHandler classes which process the topology to add all necessary forces to the system.
Implementation of modular
ParameterIOHandler classes for reading/writing different serialized SMIRNOFF forcefield representations
Topology classes for representing molecules and biomolecular systems
ToolkitWrapper interface to RDKit, OpenEye, and AmberTools toolkits, managed by
API improvements to more closely follow PEP8 guidelines
Improved documentation and examples
This is an early preview release of the toolkit that matches the functionality described in the preprint describing the SMIRNOFF v0.1 force field format: [DOI].
This release features additional documentation, code comments, and support for automated testing.
Treatment of improper torsions¶
A significant (though currently unused) problem in the handling of improper torsions was corrected. Previously, non-planar impropers did not behave correctly, as six-fold impropers have two potential chiralities. To remedy this, SMIRNOFF impropers are now implemented as three-fold impropers with consistent chirality. However, current force fields in the SMIRNOFF format have no non-planar impropers, so this change is mainly aimed at future work.