openforcefield.topology.Molecule

class openforcefield.topology.Molecule(*args, **kwargs)[source]

Mutable chemical representation of a molecule, such as a small molecule or biopolymer.

Todo

What other API calls would be useful for supporting biopolymers as small molecules? Perhaps iterating over chains and residues?

Examples

Create a molecule from an sdf file

>>> from openforcefield.utils import get_data_filename
>>> sdf_filepath = get_data_filename('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)

Convert to OpenEye OEMol object

>>> oemol = molecule.to_openeye()

Create a molecule from an OpenEye molecule

>>> molecule = Molecule.from_openeye(oemol)

Convert to RDKit Mol object

>>> rdmol = molecule.to_rdkit()

Create a molecule from an RDKit molecule

>>> molecule = Molecule.from_rdkit(rdmol)

Create a molecule from IUPAC name (requires the OpenEye toolkit)

>>> molecule = Molecule.from_iupac('imatinib')

Create a molecule from SMILES

>>> molecule = Molecule.from_smiles('Cc1ccccc1')

Warning

This API is experimental and subject to change.

Attributes:
angles

Get an iterator over all i-j-k angles.

atoms

Iterate over all Atom objects.

bonds

Iterate over all Bond objects.

conformers

Iterate over all conformers in this molecule.

impropers

Iterate over all proper torsions in the molecule

n_angles

int: number of angles in the Molecule.

n_atoms

The number of Atom objects.

n_bonds

The number of Bond objects.

n_conformers

Iterate over all Atom objects.

n_impropers

int: number of improper torsions in the Molecule.

n_particles

The number of Particle objects, which corresponds to how many positions must be used.

n_propers

int: number of proper torsions in the Molecule.

n_virtual_sites

The number of VirtualSite objects.

name

The name (or title) of the molecule

partial_charges

Returns the partial charges (if present) on the molecule

particles

Iterate over all Particle objects.

propers

Iterate over all proper torsions in the molecule

properties

The properties dictionary of the molecule

torsions

Get an iterator over all i-j-k-l torsions.

total_charge

Return the total charge on the molecule

virtual_sites

Iterate over all VirtualSite objects.

Methods

add_atom(atomic_number, formal_charge, …) Add an atom
add_bond(atom1, atom2, bond_order, is_aromatic) Add a bond between two specified atom indices
add_bond_charge_virtual_site(atoms, distance) Create a bond charge-type virtual site, in which the location of the charge is specified by the position of two atoms.
add_conformer(coordinates) # TODO: Should this not be public? Adds a conformer of the molecule
add_divalent_lone_pair_virtual_site(atoms, …) Create a divalent lone pair-type virtual site, in which the location of the charge is specified by the position of three atoms.
add_monovalent_lone_pair_virtual_site(atoms, …) Create a bond charge-type virtual site, in which the location of the charge is specified by the position of three atoms.
add_trivalent_lone_pair_virtual_site(atoms, …) Create a trivalent lone pair-type virtual site, in which the location of the charge is specified by the position of four atoms.
chemical_environment_matches(query[, …]) Retrieve all matches for a given chemical environment query.
compute_partial_charges([toolkit_registry]) Warning! Not Implemented! Calculate partial atomic charges for this molecule using an underlying toolkit
compute_partial_charges_am1bcc([…]) Calculate partial atomic charges for this molecule using AM1-BCC run by an underlying toolkit
compute_wiberg_bond_orders([charge_model, …]) Calculate wiberg bond orders for this molecule using an underlying toolkit
from_bson(serialized) Instantiate an object from a BSON serialized representation.
from_dict(molecule_dict) Create a new Molecule from a dictionary representation
from_file(filename[, file_format, …]) Create one or more molecules from a file
from_iupac(iupac_name, **kwargs) Generate a molecule from IUPAC or common name
from_json(serialized) Instantiate an object from a JSON serialized representation.
from_messagepack(serialized) Instantiate an object from a MessagePack serialized representation.
from_openeye(oemol[, allow_undefined_stereo]) Create a Molecule from an OpenEye molecule.
from_pickle(serialized) Instantiate an object from a pickle serialized representation.
from_rdkit(rdmol[, allow_undefined_stereo]) Create a Molecule from an RDKit molecule.
from_smiles(smiles[, …]) Construct a Molecule from a SMILES representation
from_toml(serialized) Instantiate an object from a TOML serialized representation.
from_topology(topology) Return a Molecule representation of an openforcefield Topology containing a single Molecule object.
from_xml(serialized) Instantiate an object from an XML serialized representation.
from_yaml(serialized) Instantiate from a YAML serialized representation.
generate_conformers([toolkit_registry, …]) Generate conformers for this molecule using an underlying toolkit
get_fractional_bond_orders([method, …]) Get fractional bond orders.
is_isomorphic(other[, …]) Determines whether the molecules are isomorphic by comparing their graphs.
to_bson() Return a BSON serialized representation.
to_dict() Return a dictionary representation of the molecule.
to_file(outfile, outfile_format[, …]) Write the current molecule to a file or file-like object
to_iupac() Generate IUPAC name from Molecule
to_json([indent]) Return a JSON serialized representation.
to_messagepack() Return a MessagePack representation.
to_networkx() Generate a NetworkX undirected graph from the Molecule.
to_openeye([aromaticity_model]) Create an OpenEye molecule
to_pickle() Return a pickle serialized representation.
to_rdkit([aromaticity_model]) Create an RDKit molecule
to_smiles([toolkit_registry]) Return a canonical isomeric SMILES representation of the current molecule
to_toml() Return a TOML serialized representation.
to_topology() Return an openforcefield Topology representation containing one copy of this molecule
to_xml([indent]) Return an XML representation.
to_yaml() Return a YAML serialized representation.
__init__(*args, **kwargs)[source]

Create a new Molecule object

Parameters:
other : optional, default=None

If specified, attempt to construct a copy of the Molecule from the specified object. This can be any one of the following:

  • a Molecule object
  • a file that can be used to construct a Molecule object
  • an openeye.oechem.OEMol
  • an rdkit.Chem.rdchem.Mol
  • a serialized Molecule object

Examples

Create an empty molecule:

>>> empty_molecule = Molecule()

Create a molecule from a file that can be used to construct a molecule, using either a filename or file-like object:

>>> from openforcefield.utils import get_data_filename
>>> sdf_filepath = get_data_filename('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)
>>> molecule = Molecule(open(sdf_filepath, 'r'), file_format='sdf')
>>> import gzip
>>> mol2_gz_filepath = get_data_filename('molecules/toluene.mol2.gz')
>>> molecule = Molecule(gzip.GzipFile(mol2_gz_filepath, 'r'), file_format='mol2')

Create a molecule from another molecule:

>>> molecule_copy = Molecule(molecule)

Convert to OpenEye OEMol object

>>> oemol = molecule.to_openeye()

Create a molecule from an OpenEye molecule:

>>> molecule = Molecule(oemol)

Convert to RDKit Mol object

>>> rdmol = molecule.to_rdkit()

Create a molecule from an RDKit molecule:

>>> molecule = Molecule(rdmol)

Create a molecule from a serialized molecule object:

>>> serialized_molecule = molecule.__getstate__()
>>> molecule_copy = Molecule(serialized_molecule)

Todo

  • If a filename or file-like object is specified but the file contains more than one molecule, what is the

proper behavior? Read just the first molecule, or raise an exception if more than one molecule is found?

  • Should we also support SMILES strings or IUPAC names for other?

Methods

__init__(*args, **kwargs) Create a new Molecule object
add_atom(atomic_number, formal_charge, …) Add an atom
add_bond(atom1, atom2, bond_order, is_aromatic) Add a bond between two specified atom indices
add_bond_charge_virtual_site(atoms, distance) Create a bond charge-type virtual site, in which the location of the charge is specified by the position of two atoms.
add_conformer(coordinates) # TODO: Should this not be public? Adds a conformer of the molecule
add_divalent_lone_pair_virtual_site(atoms, …) Create a divalent lone pair-type virtual site, in which the location of the charge is specified by the position of three atoms.
add_monovalent_lone_pair_virtual_site(atoms, …) Create a bond charge-type virtual site, in which the location of the charge is specified by the position of three atoms.
add_trivalent_lone_pair_virtual_site(atoms, …) Create a trivalent lone pair-type virtual site, in which the location of the charge is specified by the position of four atoms.
chemical_environment_matches(query[, …]) Retrieve all matches for a given chemical environment query.
compute_partial_charges([toolkit_registry]) Warning! Not Implemented! Calculate partial atomic charges for this molecule using an underlying toolkit
compute_partial_charges_am1bcc([…]) Calculate partial atomic charges for this molecule using AM1-BCC run by an underlying toolkit
compute_wiberg_bond_orders([charge_model, …]) Calculate wiberg bond orders for this molecule using an underlying toolkit
from_bson(serialized) Instantiate an object from a BSON serialized representation.
from_dict(molecule_dict) Create a new Molecule from a dictionary representation
from_file(filename[, file_format, …]) Create one or more molecules from a file
from_iupac(iupac_name, **kwargs) Generate a molecule from IUPAC or common name
from_json(serialized) Instantiate an object from a JSON serialized representation.
from_messagepack(serialized) Instantiate an object from a MessagePack serialized representation.
from_openeye(oemol[, allow_undefined_stereo]) Create a Molecule from an OpenEye molecule.
from_pickle(serialized) Instantiate an object from a pickle serialized representation.
from_rdkit(rdmol[, allow_undefined_stereo]) Create a Molecule from an RDKit molecule.
from_smiles(smiles[, …]) Construct a Molecule from a SMILES representation
from_toml(serialized) Instantiate an object from a TOML serialized representation.
from_topology(topology) Return a Molecule representation of an openforcefield Topology containing a single Molecule object.
from_xml(serialized) Instantiate an object from an XML serialized representation.
from_yaml(serialized) Instantiate from a YAML serialized representation.
generate_conformers([toolkit_registry, …]) Generate conformers for this molecule using an underlying toolkit
get_fractional_bond_orders([method, …]) Get fractional bond orders.
is_isomorphic(other[, …]) Determines whether the molecules are isomorphic by comparing their graphs.
to_bson() Return a BSON serialized representation.
to_dict() Return a dictionary representation of the molecule.
to_file(outfile, outfile_format[, …]) Write the current molecule to a file or file-like object
to_iupac() Generate IUPAC name from Molecule
to_json([indent]) Return a JSON serialized representation.
to_messagepack() Return a MessagePack representation.
to_networkx() Generate a NetworkX undirected graph from the Molecule.
to_openeye([aromaticity_model]) Create an OpenEye molecule
to_pickle() Return a pickle serialized representation.
to_rdkit([aromaticity_model]) Create an RDKit molecule
to_smiles([toolkit_registry]) Return a canonical isomeric SMILES representation of the current molecule
to_toml() Return a TOML serialized representation.
to_topology() Return an openforcefield Topology representation containing one copy of this molecule
to_xml([indent]) Return an XML representation.
to_yaml() Return a YAML serialized representation.

Attributes

angles Get an iterator over all i-j-k angles.
atoms Iterate over all Atom objects.
bonds Iterate over all Bond objects.
conformers Iterate over all conformers in this molecule.
impropers Iterate over all proper torsions in the molecule
n_angles int: number of angles in the Molecule.
n_atoms The number of Atom objects.
n_bonds The number of Bond objects.
n_conformers Iterate over all Atom objects.
n_impropers int: number of improper torsions in the Molecule.
n_particles The number of Particle objects, which corresponds to how many positions must be used.
n_propers int: number of proper torsions in the Molecule.
n_virtual_sites The number of VirtualSite objects.
name The name (or title) of the molecule
partial_charges Returns the partial charges (if present) on the molecule
particles Iterate over all Particle objects.
propers Iterate over all proper torsions in the molecule
properties The properties dictionary of the molecule
torsions Get an iterator over all i-j-k-l torsions.
total_charge Return the total charge on the molecule
virtual_sites Iterate over all VirtualSite objects.
angles

Get an iterator over all i-j-k angles.

atoms

Iterate over all Atom objects.

bonds

Iterate over all Bond objects.

chemical_environment_matches(query, toolkit_registry=<openforcefield.utils.toolkits.ToolkitRegistry object>)

Retrieve all matches for a given chemical environment query.

Parameters:
query : str or ChemicalEnvironment

SMARTS string (with one or more tagged atoms) or ChemicalEnvironment query Query will internally be resolved to SMIRKS using query.asSMIRKS() if it has an .asSMIRKS method.

toolkit_registry : openforcefield.utils.toolkits.ToolRegistry or openforcefield.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY

ToolkitRegistry or ToolkitWrapper to use for chemical environment matches

Returns:
matches : list of Atom tuples

A list of all matching Atom tuples

Examples

Retrieve all the carbon-carbon bond matches in a molecule

>>> molecule = Molecule.from_iupac('imatinib')
>>> matches = molecule.chemical_environment_matches('[#6X3:1]~[#6X3:2]')

Todo

  • Do we want to generalize query to allow other kinds of queries, such as mdtraj DSL, pymol selections, atom index slices, etc? We could call it topology.matches(query) instead of chemical_environment_matches
compute_partial_charges(toolkit_registry=<openforcefield.utils.toolkits.ToolkitRegistry object>)

Warning! Not Implemented! Calculate partial atomic charges for this molecule using an underlying toolkit

Parameters:
quantum_chemical_method : string, default=’AM1-BCC’

The quantum chemical method to use for partial charge calculation.

partial_charge_method : string, default=’None’

The partial charge calculation method to use for partial charge calculation.

toolkit_registry : openforcefield.utils.toolkits.ToolRegistry or openforcefield.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

Raises:
InvalidToolkitError

If an invalid object is passed as the toolkit_registry parameter

Examples

molecule = Molecule.from_smiles(‘CCCCCC’) molecule.generate_conformers() molecule.compute_partial_charges()

compute_partial_charges_am1bcc(toolkit_registry=<openforcefield.utils.toolkits.ToolkitRegistry object>)

Calculate partial atomic charges for this molecule using AM1-BCC run by an underlying toolkit

Parameters:
toolkit_registry : openforcefield.utils.toolkits.ToolRegistry or openforcefield.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for the calculation

Raises:
InvalidToolkitError

If an invalid object is passed as the toolkit_registry parameter

Examples

>>> molecule = Molecule.from_smiles('CCCCCC')
>>> molecule.generate_conformers()
>>> molecule.compute_partial_charges_am1bcc()
compute_wiberg_bond_orders(charge_model=None, toolkit_registry=<openforcefield.utils.toolkits.ToolkitRegistry object>)

Calculate wiberg bond orders for this molecule using an underlying toolkit

Parameters:
toolkit_registry : openforcefield.utils.toolkits.ToolRegistry or openforcefield.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

charge_model : string, optional

The charge model to use for partial charge calculation

Examples
——–
>>> molecule = Molecule.from_smiles(‘CCCCCC’)
>>> molecule.generate_conformers()
>>> molecule.compute_wiberg_bond_orders()
Raises:
InvalidToolkitError

If an invalid object is passed as the toolkit_registry parameter

conformers

Iterate over all conformers in this molecule.

classmethod from_bson(serialized)

Instantiate an object from a BSON serialized representation.

Specification: http://bsonspec.org/

Parameters:
serialized : bytes

A BSON serialized representation of the object

Returns:
instance : cls

An instantiated object

classmethod from_dict(molecule_dict)

Create a new Molecule from a dictionary representation

Parameters:
molecule_dict : OrderedDict

A dictionary representation of the molecule.

Returns:
molecule : Molecule

A Molecule created from the dictionary representation

static from_file(filename, file_format=None, toolkit_registry=<openforcefield.utils.toolkits.ToolkitRegistry object>, allow_undefined_stereo=False)

Create one or more molecules from a file

Todo

  • Extend this to also include some form of .offmol Open Force Field Molecule format?
  • Generalize this to also include file-like objects?
Parameters:
filename : str or file-like object

The name of the file or file-like object to stream one or more molecules from.

file_format : str, optional, default=None

Format specifier, usually file suffix (eg. ‘MOL2’, ‘SMI’) Note that not all toolkits support all formats. Check ToolkitWrapper.toolkit_file_read_formats for your loaded toolkits for details.

toolkit_registry : openforcefield.utils.toolkits.ToolRegistry or openforcefield.utils.toolkits.ToolkitWrapper,
optional, default=GLOBAL_TOOLKIT_REGISTRY

ToolkitRegistry or ToolkitWrapper to use for file loading. If a Toolkit is passed, only the highest-precedence toolkit is used

allow_undefined_stereo : bool, default=False

If false, raises an exception if oemol contains undefined stereochemistry.

Returns:
molecules : Molecule or list of Molecules

If there is a single molecule in the file, a Molecule is returned; otherwise, a list of Molecule objects is returned.

Examples

>>> from openforcefield.tests.utils import get_monomer_mol2file
>>> mol2_filename = get_monomer_mol2file('cyclohexane')
>>> molecule = Molecule.from_file(mol2_filename)
classmethod from_iupac(iupac_name, **kwargs)

Generate a molecule from IUPAC or common name

Parameters:
iupac_name : str

IUPAC name of molecule to be generated

allow_undefined_stereo : bool, default=False

If false, raises an exception if molecule contains undefined stereochemistry.

Returns:
molecule : Molecule

The resulting molecule with position

.. note :: This method requires the OpenEye toolkit to be installed.

Examples

Create a molecule from a common name

>>> molecule = Molecule.from_iupac('4-[(4-methylpiperazin-1-yl)methyl]-N-(4-methyl-3-{[4-(pyridin-3-yl)pyrimidin-2-yl]amino}phenyl)benzamide')

Create a molecule from a common name

>>> molecule = Molecule.from_iupac('imatinib')
classmethod from_json(serialized)

Instantiate an object from a JSON serialized representation.

Specification: https://www.json.org/

Parameters:
serialized : str

A JSON serialized representation of the object

Returns:
instance : cls

An instantiated object

classmethod from_messagepack(serialized)

Instantiate an object from a MessagePack serialized representation.

Specification: https://msgpack.org/index.html

Parameters:
serialized : bytes

A MessagePack-encoded bytes serialized representation

Returns:
instance : cls

Instantiated object.

static from_openeye(oemol, allow_undefined_stereo=False)

Create a Molecule from an OpenEye molecule.

Requires the OpenEye toolkit to be installed.

Parameters:
oemol : openeye.oechem.OEMol

An OpenEye molecule

allow_undefined_stereo : bool, default=False

If false, raises an exception if oemol contains undefined stereochemistry.

Returns:
molecule : openforcefield.topology.Molecule

An openforcefield molecule

Examples

Create a Molecule from an OpenEye OEMol

>>> from openeye import oechem
>>> from openforcefield.tests.utils import get_data_filename
>>> ifs = oechem.oemolistream(get_data_filename('systems/monomers/ethanol.mol2'))
>>> oemols = list(ifs.GetOEGraphMols())
>>> molecule = Molecule.from_openeye(oemols[0])
classmethod from_pickle(serialized)

Instantiate an object from a pickle serialized representation.

Warning

This is not recommended for safe, stable storage since the pickle specification may change between Python versions.

Parameters:
serialized : str

A pickled representation of the object

Returns:
instance : cls

An instantiated object

static from_rdkit(rdmol, allow_undefined_stereo=False)

Create a Molecule from an RDKit molecule.

Requires the RDKit to be installed.

Parameters:
rdmol : rkit.RDMol

An RDKit molecule

allow_undefined_stereo : bool, default=False

If false, raises an exception if oemol contains undefined stereochemistry.

Returns:
molecule : openforcefield.Molecule

An openforcefield molecule

Examples

Create a molecule from an RDKit molecule

>>> from rdkit import Chem
>>> from openforcefield.tests.utils import get_data_filename
>>> rdmol = Chem.MolFromMolFile(get_data_filename('systems/monomers/ethanol.sdf'))
>>> molecule = Molecule.from_rdkit(rdmol)
static from_smiles(smiles, hydrogens_are_explicit=False, toolkit_registry=<openforcefield.utils.toolkits.ToolkitRegistry object>)

Construct a Molecule from a SMILES representation

Parameters:
smiles : str

The SMILES representation of the molecule.

hydrogens_are_explicit : bool, default = False

If False, the cheminformatics toolkit will perform hydrogen addition

toolkit_registry : openforcefield.utils.toolkits.ToolRegistry or openforcefield.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

Returns:
molecule : openforcefield.topology.Molecule

Examples

>>> molecule = Molecule.from_smiles('Cc1ccccc1')
classmethod from_toml(serialized)

Instantiate an object from a TOML serialized representation.

Specification: https://github.com/toml-lang/toml

Parameters:
serlialized : str

A TOML serialized representation of the object

Returns:
instance : cls

An instantiated object

static from_topology(topology)

Return a Molecule representation of an openforcefield Topology containing a single Molecule object.

Parameters:
topology : openforcefield.topology.Topology

The Topology object containing a single Molecule object. Note that OpenMM and MDTraj Topology objects are not supported.

Returns:
molecule : openforcefield.topology.Molecule

The Molecule object in the topology

Raises:
ValueError

If the topology does not contain exactly one molecule.

Examples

Create a molecule from a Topology object that contains exactly one molecule

>>> molecule = Molecule.from_topology(topology)  # doctest: +SKIP
classmethod from_xml(serialized)

Instantiate an object from an XML serialized representation.

Specification: https://www.w3.org/XML/

Parameters:
serialized : bytes

An XML serialized representation

Returns:
instance : cls

Instantiated object.

classmethod from_yaml(serialized)

Instantiate from a YAML serialized representation.

Specification: http://yaml.org/

Parameters:
serialized : str

A YAML serialized representation of the object

Returns:
instance : cls

Instantiated object

generate_conformers(toolkit_registry=<openforcefield.utils.toolkits.ToolkitRegistry object>, num_conformers=10, clear_existing=True)

Generate conformers for this molecule using an underlying toolkit

Parameters:
toolkit_registry : openforcefield.utils.toolkits.ToolRegistry or openforcefield.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

num_conformers : int, default=1

The maximum number of conformers to produce

clear_existing : bool, default=True

Whether to overwrite existing conformers for the molecule

Raises:
InvalidToolkitError

If an invalid object is passed as the toolkit_registry parameter

Examples

>>> molecule = Molecule.from_smiles('CCCCCC')
>>> molecule.generate_conformers()
get_fractional_bond_orders(method='Wiberg', toolkit_registry=<openforcefield.utils.toolkits.ToolkitRegistry object>)

Get fractional bond orders.

method : str, optional, default=’Wiberg’
The name of the charge method to use. Options are: * ‘Wiberg’ : Wiberg bond order
toolkit_registry : openforcefield.utils.toolkits ToolkitRegistry
The toolkit registry to use for molecule operations

Examples

Get fractional Wiberg bond orders

>>> molecule = Molecule.from_iupac('imatinib')
>>> molecule.generate_conformers()
>>> fractional_bond_orders = molecule.get_fractional_bond_orders(method='Wiberg')

Todo

  • Is it OK that the Molecule object does not store geometry, but will create it using openeye.omega or rdkit?
  • Should this method assign fractional bond orders to the Bond``s in the molecule, a separate ``bond_orders molecule property, or just return the array of bond orders?
  • How do we add enough flexibility to specify the toolkit and optional parameters, such as: oequacpac.OEAssignPartialCharges(charged_copy, getattr(oequacpac, 'OECharges_AM1BCCSym'), False, False)
  • Generalize to allow user to specify both QM method and bond order computation approach (e.g. AM1 and Wiberg)
impropers

Iterate over all proper torsions in the molecule

Todo

  • Do we need to return a Torsion object that collects information about fractional bond orders?
is_isomorphic(other, compare_atom_stereochemistry=True, compare_bond_stereochemistry=True)

Determines whether the molecules are isomorphic by comparing their graphs.

Parameters:
other : an openforcefield.topology.molecule.FrozenMolecule

The molecule to test for isomorphism.

compare_atom_stereochemistry : bool, optional

If False, atoms’ stereochemistry is ignored for the purpose of determining equality. Default is True.

compare_bond_stereochemistry : bool, optional

If False, bonds’ stereochemistry is ignored for the purpose of determining equality. Default is True.

Returns:
molecules_are_isomorphic : bool
n_angles

int: number of angles in the Molecule.

n_atoms

The number of Atom objects.

n_bonds

The number of Bond objects.

n_conformers

Iterate over all Atom objects.

n_impropers

int: number of improper torsions in the Molecule.

n_particles

The number of Particle objects, which corresponds to how many positions must be used.

n_propers

int: number of proper torsions in the Molecule.

n_virtual_sites

The number of VirtualSite objects.

name

The name (or title) of the molecule

partial_charges

Returns the partial charges (if present) on the molecule

Returns:
partial_charges : a simtk.unit.Quantity - wrapped numpy array [1 x n_atoms]

The partial charges on this Molecule’s atoms.

particles

Iterate over all Particle objects.

propers

Iterate over all proper torsions in the molecule

Todo

  • Do we need to return a Torsion object that collects information about fractional bond orders?
properties

The properties dictionary of the molecule

to_bson()

Return a BSON serialized representation.

Specification: http://bsonspec.org/

Returns:
serialized : bytes

A BSON serialized representation of the objecft

to_dict()

Return a dictionary representation of the molecule.

Todo

  • Document the representation standard.
  • How do we do version control with this standard?
Returns:
molecule_dict : OrderedDict

A dictionary representation of the molecule.

to_file(outfile, outfile_format, toolkit_registry=<openforcefield.utils.toolkits.ToolkitRegistry object>)

Write the current molecule to a file or file-like object

Parameters:
outfile : str or file-like object

A file-like object or the filename of the file to be written to

outfile_format : str

Format specifier, one of [‘MOL2’, ‘MOL2H’, ‘SDF’, ‘PDB’, ‘SMI’, ‘CAN’, ‘TDT’] Note that not all toolkits support all formats

toolkit_registry : openforcefield.utils.toolkits.ToolRegistry or openforcefield.utils.toolkits.ToolkitWrapper,
optional, default=GLOBAL_TOOLKIT_REGISTRY

ToolkitRegistry or ToolkitWrapper to use for file writing. If a Toolkit is passed, only the highest-precedence toolkit is used

Raises:
ValueError

If the requested outfile_format is not supported by one of the installed cheminformatics toolkits

Examples

>>> molecule = Molecule.from_iupac('imatinib')
>>> molecule.to_file('imatinib.mol2', outfile_format='mol2')  # doctest: +SKIP
>>> molecule.to_file('imatinib.sdf', outfile_format='sdf')  # doctest: +SKIP
>>> molecule.to_file('imatinib.pdb', outfile_format='pdb')  # doctest: +SKIP
to_iupac()

Generate IUPAC name from Molecule

Returns:
iupac_name : str

IUPAC name of the molecule

.. note :: This method requires the OpenEye toolkit to be installed.

Examples

>>> from openforcefield.utils import get_data_filename
>>> sdf_filepath = get_data_filename('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)
>>> iupac_name = molecule.to_iupac()
to_json(indent=None)

Return a JSON serialized representation.

Specification: https://www.json.org/

Parameters:
indent : int, optional, default=None

If not None, will pretty-print with specified number of spaces for indentation

Returns:
serialized : str

A JSON serialized representation of the object

to_messagepack()

Return a MessagePack representation.

Specification: https://msgpack.org/index.html

Returns:
serialized : bytes

A MessagePack-encoded bytes serialized representation of the object

to_networkx()

Generate a NetworkX undirected graph from the Molecule.

Nodes are Atoms labeled with particle indices and atomic elements (via the element node atrribute). Edges denote chemical bonds between Atoms. Virtual sites are not included, since they lack a concept of chemical connectivity.

Todo

  • Do we need a from_networkx() method? If so, what would the Graph be required to provide?
  • Should edges be labeled with discrete bond types in some aromaticity model?
  • Should edges be labeled with fractional bond order if a method is specified?
  • Should we add other per-atom and per-bond properties (e.g. partial charges) if present?
  • Can this encode bond/atom chirality?
Returns:
graph : networkx.Graph

The resulting graph, with nodes (atoms) labeled with atom indices, elements, stereochemistry and aromaticity flags and bonds with two atom indices, bond order, stereochemistry, and aromaticity flags

Examples

Retrieve the bond graph for imatinib (OpenEye toolkit required)

>>> molecule = Molecule.from_iupac('imatinib')
>>> nxgraph = molecule.to_networkx()
to_openeye(aromaticity_model='OEAroModel_MDL')

Create an OpenEye molecule

Requires the OpenEye toolkit to be installed.

Todo

  • Use stored conformer positions instead of an argument.
  • Should the aromaticity model be specified in some other way?
Parameters:
aromaticity_model : str, optional, default=DEFAULT_AROMATICITY_MODEL

The aromaticity model to use

Returns:
oemol : openeye.oechem.OEMol

An OpenEye molecule

Examples

Create an OpenEye molecule from a Molecule

>>> molecule = Molecule.from_smiles('CC')
>>> oemol = molecule.to_openeye()
to_pickle()

Return a pickle serialized representation.

Warning

This is not recommended for safe, stable storage since the pickle specification may change between Python versions.

Returns:
serialized : str

A pickled representation of the object

to_rdkit(aromaticity_model='OEAroModel_MDL')

Create an RDKit molecule

Requires the RDKit to be installed.

Parameters:
aromaticity_model : str, optional, default=DEFAULT_AROMATICITY_MODEL

The aromaticity model to use

Returns:
rdmol : rkit.RDMol

An RDKit molecule

Examples

Convert a molecule to RDKit

>>> from openforcefield.utils import get_data_filename
>>> sdf_filepath = get_data_filename('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)
>>> rdmol = molecule.to_rdkit()
to_smiles(toolkit_registry=<openforcefield.utils.toolkits.ToolkitRegistry object>)

Return a canonical isomeric SMILES representation of the current molecule

Note

RDKit and OpenEye versions will not necessarily return the same representation.

Parameters:
toolkit_registry : openforcefield.utils.toolkits.ToolRegistry or openforcefield.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for SMILES conversion

Returns:
smiles : str

Canonical isomeric explicit-hydrogen SMILES

Examples

>>> from openforcefield.utils import get_data_filename
>>> sdf_filepath = get_data_filename('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)
>>> smiles = molecule.to_smiles()
to_toml()

Return a TOML serialized representation.

Specification: https://github.com/toml-lang/toml

Returns:
serialized : str

A TOML serialized representation of the object

to_topology()

Return an openforcefield Topology representation containing one copy of this molecule

Returns:
topology : openforcefield.topology.Topology

A Topology representation of this molecule

Examples

>>> molecule = Molecule.from_iupac('imatinib')
>>> topology = molecule.to_topology()
to_xml(indent=2)

Return an XML representation.

Specification: https://www.w3.org/XML/

Parameters:
indent : int, optional, default=2

If not None, will pretty-print with specified number of spaces for indentation

Returns:
serialized : bytes

A MessagePack-encoded bytes serialized representation.

to_yaml()

Return a YAML serialized representation.

Specification: http://yaml.org/

Returns:
serialized : str

A YAML serialized representation of the object

torsions

Get an iterator over all i-j-k-l torsions. Note that i-j-k-i torsions (cycles) are excluded.

Returns:
torsions : iterable of 4-Atom tuples
total_charge

Return the total charge on the molecule

virtual_sites

Iterate over all VirtualSite objects.

add_atom(atomic_number, formal_charge, is_aromatic, stereochemistry=None, name=None)[source]

Add an atom

Parameters:
atomic_number : int

Atomic number of the atom

formal_charge : int

Formal charge of the atom

is_aromatic : bool

If True, atom is aromatic; if False, not aromatic

stereochemistry : str, optional, default=None

Either ‘R’ or ‘S’ for specified stereochemistry, or None if stereochemistry is irrelevant

name : str, optional, default=None

An optional name for the atom

Returns:
index : int

The index of the atom in the molecule

Examples

Define a methane molecule

>>> molecule = Molecule()
>>> molecule.name = 'methane'
>>> C = molecule.add_atom(6, 0, False)
>>> H1 = molecule.add_atom(1, 0, False)
>>> H2 = molecule.add_atom(1, 0, False)
>>> H3 = molecule.add_atom(1, 0, False)
>>> H4 = molecule.add_atom(1, 0, False)
>>> bond_idx = molecule.add_bond(C, H1, False, 1)
>>> bond_idx = molecule.add_bond(C, H2, False, 1)
>>> bond_idx = molecule.add_bond(C, H3, False, 1)
>>> bond_idx = molecule.add_bond(C, H4, False, 1)
add_bond_charge_virtual_site(atoms, distance, charge_increments=None, weights=None, epsilon=None, sigma=None, rmin_half=None, name='')[source]

Create a bond charge-type virtual site, in which the location of the charge is specified by the position of two atoms. This supports placement of a virtual site S along a vector between two specified atoms, e.g. to allow for a sigma hole for halogens or similar contexts. With positive values of the distance, the virtual site lies outside the first indexed atom. Parameters ———- atoms : list of openforcefield.topology.molecule.Atom objects or ints of shape [N

The atoms defining the virtual site’s position or their indices

distance : float

weights : list of floats of shape [N] or None, optional, default=None
weights[index] is the weight of particles[index] contributing to the position of the virtual site. Default is None
charge_increments : list of floats of shape [N], optional, default=None
The amount of charge to remove from the VirtualSite’s atoms and put in the VirtualSite. Indexing in this list should match the ordering in the atoms list. Default is None.
epsilon : float
Epsilon term for VdW properties of virtual site. Default is None.
sigma : float, default=None
Sigma term for VdW properties of virtual site. Default is None.
rmin_half : float
Rmin_half term for VdW properties of virtual site. Default is None.
name : string or None, default=’‘
The name of this virtual site. Default is ‘’.
Returns:
index : int

The index of the newly-added virtual site in the molecule

add_monovalent_lone_pair_virtual_site(atoms, distance, out_of_plane_angle, in_plane_angle, **kwargs)[source]

Create a bond charge-type virtual site, in which the location of the charge is specified by the position of three atoms.

TODO : Do “weights” have any meaning here?

Parameters:
atoms : list of three openforcefield.topology.molecule.Atom objects or ints

The three atoms defining the virtual site’s position or their molecule atom indices

distance : float
out_of_plane_angle : float
in_plane_angle : float
epsilon : float

Epsilon term for VdW properties of virtual site. Default is None.

sigma : float, default=None

Sigma term for VdW properties of virtual site. Default is None.

rmin_half : float

Rmin_half term for VdW properties of virtual site. Default is None.

name : string or None, default=’‘

The name of this virtual site. Default is ‘’.

Returns:
index : int

The index of the newly-added virtual site in the molecule

add_divalent_lone_pair_virtual_site(atoms, distance, out_of_plane_angle, in_plane_angle, **kwargs)[source]

Create a divalent lone pair-type virtual site, in which the location of the charge is specified by the position of three atoms.

TODO : Do “weights” have any meaning here?

Parameters:
atoms : list of 3 openforcefield.topology.molecule.Atom objects or ints

The three atoms defining the virtual site’s position or their molecule atom indices

distance : float
out_of_plane_angle : float
in_plane_angle : float
epsilon : float

Epsilon term for VdW properties of virtual site. Default is None.

sigma : float, default=None

Sigma term for VdW properties of virtual site. Default is None.

rmin_half : float

Rmin_half term for VdW properties of virtual site. Default is None.

name : string or None, default=’‘

The name of this virtual site. Default is ‘’.

Returns:
index : int

The index of the newly-added virtual site in the molecule

add_trivalent_lone_pair_virtual_site(atoms, distance, out_of_plane_angle, in_plane_angle, **kwargs)[source]

Create a trivalent lone pair-type virtual site, in which the location of the charge is specified by the position of four atoms.

TODO : Do “weights” have any meaning here?

Parameters:
atoms : list of 4 openforcefield.topology.molecule.Atom objects or ints

The three atoms defining the virtual site’s position or their molecule atom indices

distance : float
out_of_plane_angle : float
in_plane_angle : float
epsilon : float

Epsilon term for VdW properties of virtual site. Default is None.

sigma : float, default=None

Sigma term for VdW properties of virtual site. Default is None.

rmin_half : float

Rmin_half term for VdW properties of virtual site. Default is None.

name : string or None, default=’‘

The name of this virtual site. Default is ‘’.

Returns:
index : int

The index of the newly-added virtual site in the molecule

add_bond(atom1, atom2, bond_order, is_aromatic, stereochemistry=None, fractional_bond_order=None)[source]

Add a bond between two specified atom indices

Parameters:
atom1 : int or openforcefield.topology.molecule.Atom

Index of first atom

atom2 : int or openforcefield.topology.molecule.Atom

Index of second atom

bond_order : int

Integral bond order of Kekulized form

is_aromatic : bool

True if this bond is aromatic, False otherwise

stereochemistry : str, optional, default=None

Either ‘E’ or ‘Z’ for specified stereochemistry, or None if stereochemistry is irrelevant

fractional_bond_order : float, optional, default=None

The fractional (eg. Wiberg) bond order

Returns:
index: int

Index of the bond in this molecule

add_conformer(coordinates)[source]

# TODO: Should this not be public? Adds a conformer of the molecule

Parameters:
coordinates: simtk.unit.Quantity(np.array) with shape (n_atoms, 3)

Coordinates of the new conformer, with the first dimension of the array corresponding to the atom index in the Molecule’s indexing system.

Returns
——-
index: int

Index of the conformer in the Molecule