Information on descriptors in 3DMET


Contents
Entry Name Formula Molecular Weight InChI SMILES™ CASRN®
Formal Charge Weight LogP(o/w) SlogP LogS SMR Volume
ASA VSA TPSA Density Diameter Dipole Globularity
Potential energy Number of Atoms Bonds Rings

Entry

Each structure has its own ENTRY number. The format is "B00000", where each 0 stands for a digit.

Name

Compound names, as given in KEGG COMPOUND. Please refer to the GenomeNet server for further details.
Additional descriptions in parentheses are as follows.

Formula

Molecular formula. For example, benzene is described as "C6H6" and isopropanol as "C3H8O".
The order of atoms is C H N O P S in this version.
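The element ordering above can be sketched in a few lines of Python. This is only an illustration, not the code used to build 3DMET: the function name `format_formula` is hypothetical, and the alphabetical placement of elements outside the stated C H N O P S list is an assumption, since the text does not say how they are ordered.

```python
# Element order stated for this version of 3DMET.
PRIORITY = ["C", "H", "N", "O", "P", "S"]


def format_formula(counts):
    """Build a formula string from a mapping of element symbol -> atom count."""
    # Assumption: elements not in the stated list follow in alphabetical order.
    others = sorted(e for e in counts if e not in PRIORITY)
    parts = []
    for element in PRIORITY + others:
        n = counts.get(element, 0)
        if n > 0:
            # A count of 1 is written without a number, as in "C3H7NO2S".
            parts.append(element if n == 1 else f"{element}{n}")
    return "".join(parts)


print(format_formula({"H": 6, "C": 6}))                           # C6H6 (benzene)
print(format_formula({"S": 1, "O": 2, "N": 1, "H": 7, "C": 3}))   # C3H7NO2S (cysteine)
```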

Molecular Weight and Weight

The two parameters are almost the same; they differ only in how they are calculated.
  • Molecular Weight: The value described in KEGG COMPOUND.
  • Weight: A value calculated by MOE™.

InChI

InChI is a string describing molecules that was originally developed by IUPAC. For details, see the InChI homepage of IUPAC.

SMILES™

SMILES™ stands for Simplified Molecular Input Line Entry System. It is a chemical language that describes a molecule as a string. For example, benzene is described as "c1ccccc1" and isopropanol as "CC(C)O".
The canonical SMILES is calculated by the Daylight SMILES program.

CASRN®

The CAS (Chemical Abstract Service) Registry Number® of the molecule, if the compound has the number.
CAS Registry Number® (CASRN®) is a Registered Trademark of the American Chemical Society.

Formal Charge

Total charge of the molecule (sum of formal charges).
Calculated by MOE™.

LogP(o/w) and SlogP

Both values are predictions of LogP (the octanol-water partition coefficient).
These two descriptors are calculated by MOE™.
  • LogP(o/w)
      Log of the octanol/water partition coefficient (including implicit hydrogens). This property is calculated from a linear atom type model (Labute, 1998) with r² = 0.931 and RMSE = 0.393 on 1,827 molecules.
  • SlogP
      Log of the octanol/water partition coefficient (including implicit hydrogens). This property is an atomic contribution model (Wildman and Crippen, 1999) that calculates logP from the given structure, i.e., it assumes the correct protonation state (washed structures). Results may vary from the logP(o/w) descriptor. The training set for SlogP was ~7000 structures.

LogS

Log of the aqueous solubility (mol/L). This property is calculated from an atom contribution linear atom type model (Hou, 2004) with r² = 0.90 on ~1,200 molecules.
This descriptor is calculated by MOE™.

SMR

Molecular refractivity (including implicit hydrogens). This property is an atomic contribution model (Wildman and Crippen, 1999) that assumes the correct protonation state (washed structures). The model was trained on ~7000 structures and results may vary from the mr descriptor.
This descriptor is calculated by MOE™.

Volume

Van der Waals volume calculated using a grid approximation (spacing 0.75 Å).
This molecular volume is calculated by MOE™ from the 3D structures in 3DMET.
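A grid approximation of this kind can be sketched as follows: lay a cubic grid over the molecule, count the cells whose centre falls inside at least one atomic sphere, and multiply by the cell volume. This is a minimal illustration only; the function name `grid_volume`, the cell-centre sampling rule, and the radius of 1.7 Å used in the example are assumptions, and the exact procedure and radii used by MOE are not given here.

```python
import math
from itertools import product


def grid_volume(atoms, spacing=0.75):
    """Approximate the volume of a union of spheres on a cubic grid.

    atoms   -- list of (x, y, z, radius) tuples, all in angstroms
    spacing -- grid spacing in angstroms (0.75 as in the text above)
    """
    # Bounding box that encloses every atomic sphere.
    lo = [min(a[i] - a[3] for a in atoms) for i in range(3)]
    hi = [max(a[i] + a[3] for a in atoms) for i in range(3)]
    cells = [int(math.ceil((hi[i] - lo[i]) / spacing)) for i in range(3)]

    inside = 0
    for ix, iy, iz in product(range(cells[0]), range(cells[1]), range(cells[2])):
        # Assumption: each cell is sampled at its centre.
        px = lo[0] + (ix + 0.5) * spacing
        py = lo[1] + (iy + 0.5) * spacing
        pz = lo[2] + (iz + 0.5) * spacing
        # A cell counts once, even where spheres overlap.
        if any((px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2 <= r * r
               for x, y, z, r in atoms):
            inside += 1
    return inside * spacing ** 3


# One sphere with an illustrative radius of 1.7 angstroms (assumed value).
single_atom = [(0.0, 0.0, 0.0, 1.7)]
print(grid_volume(single_atom))               # coarse estimate at 0.75 angstrom spacing
print(grid_volume(single_atom, spacing=0.1))  # finer grid, closer to 4/3 * pi * r^3
```

A coarse spacing such as 0.75 Å keeps the calculation fast at the cost of some accuracy; the estimate approaches the analytic value as the spacing shrinks.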

ASA

Water accessible surface area calculated using a radius of 1.4 Å for the water molecule. A polyhedral representation is used for each atom in calculating the surface area.
This descriptor is calculated by MOE™.

VSA

Van der Waals surface area. A polyhedral representation is used for each atom in calculating the surface area.
This descriptor is calculated by MOE™.

TPSA

Polar surface area calculated using group contributions with the parameters of Ertl et al (2000).
This descriptor is calculated by MOE™.

Density

Molecular mass density: weight divided by vdw_vol (amu/Å³).
This descriptor is calculated by MOE™.
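As a small worked sketch of this ratio, the function below divides a weight by a van der Waals volume and also converts the result to g/cm³ (1 amu/Å³ ≈ 1.66054 g/cm³). The function names and the input numbers are hypothetical and chosen only for illustration; they are not values taken from 3DMET.

```python
# 1 amu = 1.66054e-24 g and 1 cubic angstrom = 1e-24 cm^3,
# so 1 amu per cubic angstrom is about 1.66054 g/cm^3.
AMU_PER_A3_TO_G_PER_CM3 = 1.66054


def mass_density(weight_amu, vdw_volume_a3):
    """Return weight / vdw_vol in amu per cubic angstrom."""
    if vdw_volume_a3 <= 0:
        raise ValueError("van der Waals volume must be positive")
    return weight_amu / vdw_volume_a3


def to_g_per_cm3(density_amu_per_a3):
    """Convert a density from amu per cubic angstrom to g/cm^3."""
    return density_amu_per_a3 * AMU_PER_A3_TO_G_PER_CM3


# Hypothetical inputs, for illustration only.
density = mass_density(100.0, 125.0)
print(density)                # 0.8 (amu per cubic angstrom)
print(to_g_per_cm3(density))  # the same density in g/cm^3
```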

Diameter

Largest value in the distance matrix (Petitjean, 1992).
This descriptor is calculated by MOE™.
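A minimal sketch of this descriptor, assuming that the distance matrix holds topological distances (the number of bonds on the shortest path between two atoms) over a connected heavy-atom graph. The function names and the adjacency-list input format are hypothetical.

```python
from collections import deque


def distances_from(start, adjacency):
    """Breadth-first search: bond-count distance from start to every atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in dist:
                dist[neighbour] = dist[atom] + 1
                queue.append(neighbour)
    return dist


def diameter(adjacency):
    """Largest entry of the topological distance matrix (connected graph assumed)."""
    return max(max(distances_from(atom, adjacency).values()) for atom in adjacency)


# Heavy-atom graphs only (hydrogens omitted).
benzene = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}   # six-membered ring
isopropanol = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}          # CC(C)O, atom 1 is the centre

print(diameter(benzene))      # 3
print(diameter(isopropanol))  # 2
```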

Dipole

Dipole moment calculated from the partial charges of the molecule.
This descriptor is calculated by MOE™.

Globularity

Globularity or inverse condition number (smallest eigenvalue divided by the largest eigenvalue) of the covariance matrix of atomic coordinates. A value of 1 indicates a perfect sphere while a value of 0 indicates a two- or one-dimensional object.
This descriptor is calculated by MOE™.

Potential energy

These values are calculated by MOE™.
  • Total energy: value of the potential energy (sum of the following energy terms).
  • Angle bend energy: angle bend potential energy.
  • Electrostatic energy: electrostatic component of the potential energy.
  • Non-bond energy: value of the potential energy with all bonded terms disabled.
  • Out-of-plane energy: out-of-plane potential energy.
  • Solvation energy.
  • Bond stretch-bend energy: bond stretch-bend cross-term potential energy.
  • Torsion energy: torsion (proper and improper) potential energy.
  • Van der Waals energy: van der Waals component of the potential energy.

Number of atoms

These values are calculated by MOE™.
  • Chiral atoms: the number of chiral centers.
  • H-bond acceptor: number of hydrogen bond acceptor atoms (not counting acidic atoms but counting atoms that are both hydrogen bond donors and acceptors such as -OH).
  • H-bond donor: number of hydrogen bond donor atoms (not counting basic atoms but counting atoms that are both hydrogen bond donors and acceptors such as -OH).
  • Acidic atoms: number of acidic atoms.
  • Basic atoms: number of basic atoms.
  • Aromatic atoms: number of aromatic atoms.
  • Heavy atoms: number of non-hydrogen atoms.

Number of bonds

These values are calculated by MOE™.
  • Single bonds: number of single bonds (including implicit hydrogens). Aromatic bonds are not considered to be single bonds.
  • Rotatable single bonds: number of rotatable single bonds. Conjugated single bonds are not included (e.g., ester and peptide bonds).
  • Double bonds: number of double bonds. Aromatic bonds are not considered to be double bonds.
  • Triple bonds: number of triple bonds. Aromatic bonds are not considered to be triple bonds.
  • Aromatic bonds: number of aromatic bonds.

Number of rings

The number of rings in the molecule. If these rings are not independent, each ring is counted.
This descriptor is calculated by MOE™.

COMPOUND_number

The number corresponds to the KEGG COMPOUND entry number. The format is "C00000", where each 0 stands for a digit. Please refer to the GenomeNet server for details.
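The two identifier formats described on this page ("B" plus five digits for 3DMET entries, "C" plus five digits for KEGG COMPOUND) can be told apart with a simple pattern check. The sketch below is illustrative only; the function name `classify_id` and the returned labels are hypothetical, and it checks the format only, not whether the entry actually exists.

```python
import re

ENTRY_RE = re.compile(r"B\d{5}")     # 3DMET ENTRY number, e.g. "B00001"
COMPOUND_RE = re.compile(r"C\d{5}")  # KEGG COMPOUND number, e.g. "C00001"


def classify_id(identifier):
    """Return which kind of identifier a string looks like, or None."""
    if ENTRY_RE.fullmatch(identifier):
        return "3DMET entry"
    if COMPOUND_RE.fullmatch(identifier):
        return "KEGG COMPOUND"
    return None


print(classify_id("B00001"))  # 3DMET entry
print(classify_id("C00001"))  # KEGG COMPOUND
print(classify_id("B123"))    # None (too few digits)
```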


Literature

  • COMPOUND: Goto, S., Nishioka, T., and Kanehisa, M. "LIGAND: Chemical Database for Enzyme Reactions." Bioinformatics, 14, 591-599 (1998).
  • SMILES™: Weininger, D. "SMILES, A Chemical Language and Information System. 1. Introduction to Methodology and Encoding Rules." J. Chem. Inf. Comput. Sci., 28, 31-36 (1988).
  • InChI: McNaught, A. "The IUPAC International Chemical Identifier (InChI)." 43rd IUPAC General Assembly (2005) (poster).
  • LogP: Labute, P. "MOE LogP(Octanol/Water) Model." (1998).
  • SlogP and SMR: Wildman, S.A. and Crippen, G.M. "Prediction of Physicochemical Parameters by Atomic Contributions." J. Chem. Inf. Comput. Sci., 39, 868-873 (1999).
  • Diameter: Petitjean, M. "Applications of the Radius-Diameter Diagram to the Classification of Topological and Geometrical Shapes of Chemical Compounds." J. Chem. Inf. Comput. Sci., 32, 331-337 (1992).
  • TPSA: Ertl, P., Rohde, B., and Selzer, P. "Fast Calculation of Molecular Polar Surface Area as a Sum of Fragment-Based Contributions and Its Application to the Prediction of Drug Transport Properties." J. Med. Chem., 43, 3714-3717 (2000).

Programs